Swapping OCR Representations in KTM

This isn’t a post about which OCR engine is better. Sometimes FineReader will do a better job, sometimes RecoStar shines. In fact, sometimes you’d love to work with the results of both engines. Kofax Transformation, however, supports only one full-page OCR engine at a time. Sure, there’s the OCR re-read option, but that’s tied to a field. What if you want your format locators to run on FineReader, but fire your database locators against RecoStar’s results?

Well, I know how to modify and handle representations, which are basically the object model Kofax Transformation uses to store OCR results. Representations consist of pages, text lines, and individual words, which in turn have coordinates, text, and so on. However, while I’ve been working with Kofax products for more than 10 years now, it was only today that I learned how to fire OCR in script (thanks to Brendan’s post in the LinkedIn user group). So, I wanted to share the results. Today’s questions are:

Can we perform OCR in script?
Can we tell the locators which OCR results to use?

Firing more than one OCR engine

By default, this is what you get when you process a document in Transformations:

Our document was processed with the default settings, so we have ABBYY’s FineReader results. There’s one representation by default, and all locators will use its results. However, Kofax ships the components required to perform OCR in script with Transformations. Make sure to add the following references to the script:

The recognizer objects have a Recognize method. Just provide an xdoc, a page profile, and a page number, and you’ll end up with a new OCR representation. There’s a catch, however: this method only ever updates the very first representation in the xdoc, and it does not seem to clear that representation first. That is why the script below removes all representations before each OCR pass. Long story short, here’s the script that allows you to perform OCR twice:


Private Sub Document_BeforeExtract(ByVal pXDoc As CASCADELib.CscXDocument)

   Dim repFR As CscXDocRepresentation
   Dim repRS As CscXDocRepresentation
   Dim recognizerFR As New MpsPageRecognizerFR
   Dim recognizerRS As New MpsPageRecognizerRecoStar
   Dim idxPg As Long

   ' fire up OCR engines, first finereader, then recostar
   XDocument_ClearAllRepresentations(pXDoc)
   For idxPg = 0 To pXDoc.CDoc.Pages.Count - 1
      recognizerFR.Recognize(pXDoc, Project.RecogProfiles.ItemByName("FineReader"), idxPg)
   Next
   Set repFR = pXDoc.Representations(0)

   XDocument_ClearAllRepresentations(pXDoc)
   For idxPg = 0 To pXDoc.CDoc.Pages.Count - 1
      recognizerRS.Recognize(pXDoc, Project.RecogProfiles.ItemByName("RecoStar"), idxPg)
   Next
   Set repRS = pXDoc.Representations(0)

   ' finally, remove all reps again and repopulate them in whatever order you like
   ' (the representation created first ends up at index 0, the one locators use by default)
   XDocument_ClearAllRepresentations(pXDoc)
   Representation_Copy(repFR, pXDoc.Representations.Create("FineReader"))
   Representation_Copy(repRS, pXDoc.Representations.Create("RecoStar"))
   pXDoc.Save

End Sub


Public Sub XDocument_ClearAllRepresentations(pXDoc As CscXDocument)
   ' helper to remove all existing representations from an xdocument
   ' (iterates backwards so that the remaining indices stay valid while removing)
   Dim i As Long
   For i = pXDoc.Representations.Count - 1 To 0 Step -1
      pXDoc.Representations.Remove(i)
   Next
End Sub


Public Sub Representation_Copy(fromRep As CscXDocRepresentation, toRep As CscXDocRepresentation)
   ' copies one representation to another
   Dim idxPg As Long
   Dim idxWord As Long

   For idxPg = 0 To fromRep.Pages.Count - 1
      For idxWord = 0 To fromRep.Pages(idxPg).Words.Count - 1
         toRep.Pages(idxPg).AddWord(Word_Create(fromRep.Pages(idxPg).Words(idxWord)))
      Next
   Next
   toRep.AnalyzeLines
End Sub
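
The Word_Create helper called in Representation_Copy is not listed in this post. A minimal sketch of what it might look like follows. It assumes that a CscXDocWord can be instantiated with New and that copying the text, page index, and coordinates is enough for your locators. Treat the property list as an assumption and check it against the object model of your KTM version.

```vb
Public Function Word_Create(fromWord As CscXDocWord) As CscXDocWord
   ' Hypothetical reconstruction: builds a fresh word that carries over only text and geometry.
   ' The index properties (IndexOnPage, IndexInTextLine, ...) are left at their defaults
   ' and are recalculated when AnalyzeLines is called on the target representation.
   Dim newWord As New CscXDocWord
   newWord.Text = fromWord.Text
   newWord.PageIndex = fromWord.PageIndex
   newWord.Left = fromWord.Left
   newWord.Top = fromWord.Top
   newWord.Width = fromWord.Width
   newWord.Height = fromWord.Height
   Set Word_Create = newWord
End Function
```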

Now, let’s have a look at this xdoc again: we ended up with the results of both engines. There are some differences. In our example, RecoStar has found 89 words, while FineReader found 85. By default, all locators use the results from the first representation at index 0, which in this case is FineReader.

xdoc-two-reps — Two representations in one xdoc!
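
If you would rather check this from script than in the xdoc browser, a small helper along these lines could print what is in the xdoc. It is only a sketch: the name XDocument_PrintRepresentations is made up for illustration, and it relies on the same properties the scripts above already use (Representations, Name, Pages, Words), plus Debug.Print for output.

```vb
Public Sub XDocument_PrintRepresentations(pXDoc As CscXDocument)
   ' Hypothetical helper: prints index, name and total word count of every representation
   Dim idxRep As Long
   Dim idxPg As Long
   Dim wordCount As Long
   Dim rep As CscXDocRepresentation

   For idxRep = 0 To pXDoc.Representations.Count - 1
      Set rep = pXDoc.Representations(idxRep)
      wordCount = 0
      ' sum up the words of all pages in this representation
      For idxPg = 0 To rep.Pages.Count - 1
         wordCount = wordCount + rep.Pages(idxPg).Words.Count
      Next
      Debug.Print idxRep & ": " & rep.Name & " (" & wordCount & " words)"
   Next
End Sub
```

Calling it at the end of Document_BeforeExtract should list both representations, with the one at index 0 being the one the locators will read.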

Here’s an example: in the second representation, RecoStar interpreted a checkbox as a strange-looking word. When we fire a format locator that looks for exactly this string, we won’t see any results:

As expected, when testing the locator we end up with nothing:

Again, this is just to illustrate that the second representation does not matter – yet. No worries, we’ll make it matter.

Swapping Representations

Good news is – we already have what we need. The script above already contains a helper to copy a representation – so, swapping them seems easy enough:


Public Sub XDocument_SwapRepresentations(pXDoc As CscXDocument)
   ' swaps the first two representations of an xdoc (will only work when there are exactly two reps)
   Dim rep0 As CscXDocRepresentation
   Dim rep1 As CscXDocRepresentation
   Dim n0 As String
   Dim n1 As String

   If pXDoc.Representations.Count = 2 Then
      Set rep0 = pXDoc.Representations(0)
      n0 = pXDoc.Representations(0).Name
      Set rep1 = pXDoc.Representations(1)
      n1 = pXDoc.Representations(1).Name
      ' rename the "old" representations
      rep0.Name += "_old"
      rep1.Name += "_old"
      ' copy and remove the old reps
      Representation_Copy(rep1, pXDoc.Representations.Create(n1))
      Representation_Copy(rep0, pXDoc.Representations.Create(n0))
      pXDoc.Representations.Remove(0)
      pXDoc.Representations.Remove(0)
   End If

End Sub

Note that this script only works when there are exactly two representations in the xdoc. But where to call it? Easy enough: locators run in exactly the order in which they appear in Project Builder, so just call the swap helper from a script locator placed right before the locator that needs the other engine’s results. In our simplified use case, that’s right before our format locator.

I won’t bore you with the contents of the script locator as it’s easy enough: just call XDocument_SwapRepresentations. You can, in fact, call it as often as you like (or need to). So, here’s the result:

swapped-reps — Swapped representations – locators will use the first rep at index 0.

And as expected, when we fire up the format locator we get a 100%-hit:

That’s all you need to do if you want to use results from more than one OCR engine. Please note that firing two engines will likely deduct each page from your licensed page count twice; however, I did not verify that.
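
For completeness, the script locator that triggers the swap could look roughly like this. The locator name SL_SwapReps is a placeholder; the event handler has to carry the name of your own script locator.

```vb
' "SL_SwapReps" is a placeholder name; replace it with the name of your script locator
Private Sub SL_SwapReps_LocateAlternatives(ByVal pXDoc As CASCADELib.CscXDocument, ByVal pLocator As CASCADELib.CscXDocField)
   ' This locator produces no alternatives of its own;
   ' it only changes which representation sits at index 0 for the locators that follow
   XDocument_SwapRepresentations(pXDoc)
End Sub
```

If later locators should go back to the original engine, a second script locator with the same call can be placed after the format locator to swap the representations back.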

Zany Zone (aka side notes about the Script)

Why the Word_Create helper? It seems that you cannot add one representation’s word to another representation. From what I’ve learned, that is related to some representation-specific properties, such as IndexInBlock, IndexInTextLine, IndexOnDocument and IndexOnPage. All these indices are recalculated when firing the respective method (in our case, AnalyzeLines is sufficient). Hence the helper: it creates a new word with the defaults for all indices (i.e. -1).
Why does swapping only work with 2 alternatives? Well, because I coded it that way. I wanted some quick results. Feel free to improve the method, and please send me the improved version 😉
Do I have to pay twice for every page? I really don’t know, and it’s hard to say with a developer’s license. Give it a shot and let me know!
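
On the two-alternatives limitation: one possible generalization is to move a representation to index 0 by name and keep the others in their original order. The following is an untested sketch, not a verified implementation. The name XDocument_PromoteRepresentation is made up, and it assumes, like the swap helper above, that Create appends the new representation at the end and that Name can be reassigned.

```vb
Public Sub XDocument_PromoteRepresentation(pXDoc As CscXDocument, repName As String)
   ' Sketch: rebuilds all representations so that the one called repName ends up at index 0
   Dim oldCount As Long
   Dim idxTarget As Long
   Dim i As Long
   Dim names() As String

   oldCount = pXDoc.Representations.Count
   If oldCount < 2 Then Exit Sub
   idxTarget = -1
   ReDim names(oldCount - 1)
   For i = 0 To oldCount - 1
      names(i) = pXDoc.Representations(i).Name
      If names(i) = repName Then idxTarget = i
   Next
   If idxTarget <= 0 Then Exit Sub   ' not found, or already first

   ' rename the originals so that the copies can reuse their names
   For i = 0 To oldCount - 1
      pXDoc.Representations(i).Name = names(i) & "_old"
   Next
   ' append the copies: the target first, then everything else in its original order
   Representation_Copy(pXDoc.Representations(idxTarget), pXDoc.Representations.Create(repName))
   For i = 0 To oldCount - 1
      If i <> idxTarget Then
         Representation_Copy(pXDoc.Representations(i), pXDoc.Representations.Create(names(i)))
      End If
   Next
   ' drop the originals, which still occupy the first oldCount slots
   For i = 1 To oldCount
      pXDoc.Representations.Remove(0)
   Next
End Sub
```

With the two representations from this post, calling XDocument_PromoteRepresentation(pXDoc, "RecoStar") should have the same effect as one call to the swap helper.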

Quipu Blog

Kofax Subject Matter Experts
