1. Overview

The Cognitiv+ Analytics Engine offers an API through which third-party applications can use its contract analytics capabilities. It receives a document (in docx, pdf, xml or html format) and produces a number of results that the user can then retrieve.

A sample architecture using the API is depicted below.

Below is an overview of all the endpoints offered by the API:

Endpoint / Arguments / Description

/login

• Email

• Password

Authenticates and authorizes the user who is going to call the other endpoints, by means of an authorization token.

/process

• onlyPreprocess

• forceOcr

• file

• modelsList

Processes the uploaded file. Options allow running pre-processing only, without NLP analysis (onlyPreprocess), forcing OCR even on machine-readable pdfs (forceOcr), and overriding the default list of models to be run (modelsList).

/analyze

• xml file

• contractType

• modelsList

Performs the NLP analysis on the provided xml file, which is the one produced by pre-processing. Options allow forcing the selection of the file type and overriding the default list of models to be run.

/get-file-data

• processing_id

Retrieves the data produced by the NLP analysis, such as the Insights, Index, Obligations, etc.

/get-machine-readable-pdf

• processing_id

Retrieves the machine-readable pdf, which is a result of pre-processing.

/get-file-xml

• processing_id

Retrieves the contents of the file in xml, which is a result of pre-processing.

/get-file-html

• processing_id

Retrieves the contents of the file in html. This can simply be the xml file converted to html, or an annotated html with the results of the analysis (according to the “annotations” argument).

/extract-annotations

• processing_id

• html file

Accepts the annotated html, extracts the annotations as json and converts the contents to xml.

/get-section-data

• processing_id

Retrieves the sections of a document.

/get-section-xml

• section_id

Retrieves a specific section in xml.

/merge-xml-data

• xml file

• list of NLP findings in json format

Merges the xml with the data produced by NLP (insights, structure, etc.) into an annotated HTML file.

/analyze-html

• HTML file (should be saved in UTF-8 encoding)

• contractType

• modelsList

Performs the NLP analysis on the provided HTML file. Options allow forcing the selection of the file type and overriding the default list of models to be run.
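As an illustration of the /process arguments listed above, the following minimal sketch assembles the non-file form fields of a request. The helper name `build_process_fields`, the "True"/"False" string encoding of the flags, and the comma-separated encoding of modelsList are assumptions made for this example; the actual wire format may differ.

```python
def build_process_fields(only_preprocess=False, force_ocr=False, models_list=None):
    """Build the non-file form fields for a /process request.

    Hypothetical helper: the field encodings below are assumptions,
    not something confirmed by the API description.
    """
    fields = {
        # Assumption: boolean flags are sent as the strings "True" / "False".
        "onlyPreprocess": str(bool(only_preprocess)),
        "forceOcr": str(bool(force_ocr)),
    }
    if models_list:
        # Assumption: model names are joined into one comma-separated value.
        fields["modelsList"] = ",".join(models_list)
    return fields
```

For example, `build_process_fields(force_ocr=True, models_list=["model_a", "model_b"])` would request forced OCR and a custom model list; the model names here are placeholders. Omitting modelsList leaves the default list of models in effect.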

 

The API requires authentication, performed through the /login endpoint, after which a file can be transferred for analysis. Processing can be long-running, so the client needs to monitor the status of the request using the unique job id.
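The monitoring step described above could be sketched as a simple polling loop. The text does not say which call reports the status of a job, so `fetch_status` is a caller-supplied placeholder, and the status values "done" and "failed" are assumptions made for illustration.

```python
import time


def wait_for_job(job_id, fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job finishes or the timeout elapses.

    fetch_status is a hypothetical callable returning a status string;
    "done" and "failed" are assumed values, not documented ones.
    """
    waited = 0.0
    while True:
        status = fetch_status(job_id)
        if status == "done":
            return status
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {status!r} after {waited:.0f}s")
        # Wait before asking again so the server is not flooded with requests.
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. Once the job has finished, the get-* endpoints can be called with the same id.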

[Figure: api diagrams-08]

The dependencies of each endpoint are shown below:

/login

• Can be used afterwards: all other endpoints

/process

• Can be used afterwards: all get-* endpoints

/analyze

• Depends on: /process, /get-file-xml

• Can be used afterwards: /get-file-data (to retrieve the analysis results), /get-file-html (to retrieve the contents of the file in html annotated with the results), /get-section-data (to get the NLP sections of a specific document)

/get-file-data

• Depends on: /process (onlyPreprocess=False), or

/get-machine-readable-pdf

• Depends on: /process

/get-file-xml

• Depends on: /process

• Can be used afterwards: /analyze (to analyze the document using NLP)

/get-file-html

• Depends on: /process (the “annotations” argument cannot be used if /process?onlyPreprocess=True), or /analyze

• Can be used afterwards: /extract-annotations (to turn the annotated html into xml and a separate structure with the annotated data)

/extract-annotations

• Depends on: /get-file-html

/get-section-data

• Depends on: /process (onlyPreprocess=False), or /analyze

• Can be used afterwards: /get-section-xml (to get a section of a document as xml, given the sectionID)

/get-section-xml

• Depends on: /get-section-data

/merge-xml-data

• Depends on: /get-file-xml, /get-section-data

/analyze-html

• Can be used afterwards: /get-file-data (to get the NLP findings of a specific document), /get-section-data (to get the NLP sections of a specific document)
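
A client may want to check these dependencies before issuing a call. The sketch below transcribes the "Depends on" entries above into a lookup table. The names `DEPENDS_ON` and `can_call` are hypothetical, and reading entries listed together as "all required" and entries separated by "or" as alternatives is an assumption. The onlyPreprocess conditions are not modeled.

```python
# Each endpoint maps to a list of alternatives; every alternative is the set
# of endpoints that must already have been called. Treating entries listed
# together as "all required" is an assumption of this sketch.
DEPENDS_ON = {
    "/login": [set()],
    "/process": [set()],
    "/analyze": [{"/process", "/get-file-xml"}],
    "/get-file-data": [{"/process"}],  # the second alternative is blank in the table
    "/get-machine-readable-pdf": [{"/process"}],
    "/get-file-xml": [{"/process"}],
    "/get-file-html": [{"/process"}, {"/analyze"}],
    "/extract-annotations": [{"/get-file-html"}],
    "/get-section-data": [{"/process"}, {"/analyze"}],
    "/get-section-xml": [{"/get-section-data"}],
    "/merge-xml-data": [{"/get-file-xml", "/get-section-data"}],
    "/analyze-html": [set()],
}


def can_call(endpoint, already_called):
    """Return True if the prerequisites of `endpoint` are in `already_called`."""
    called = set(already_called)
    # /login has to come before every other endpoint.
    if endpoint != "/login" and "/login" not in called:
        return False
    # One fully satisfied alternative is enough.
    return any(required <= called for required in DEPENDS_ON[endpoint])
```

For instance, `can_call("/get-section-xml", ["/login", "/process"])` is false until /get-section-data has also been called.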