A list of API-based AI systems.

This page lists some of the most popular API-based AI systems for a range of common tasks. There are lots of great solutions out there, but it can be hard to pick the one that best fits your use case. If you’d like some advice on choosing, reach out to us through the form below and we’d be happy to help out!

Also, if you notice that we’re missing a popular API-based AI system, or that any of the information below is out of date, please get in touch at any time!

Image/Video Recognition

These APIs perform image or video recognition, i.e. they take an image or video as input and return a label or set of labels that describe the image or video. There are a number of tasks that they tend to focus on, including:

  • Standard object detection: detecting a standard set of objects, such as people, cars, and animals.
  • Custom object detection: detecting objects specified by the user.
  • Face detection: detecting faces in images.
  • Face recognition: comparing two faces to see if they are the same person.
  • Content moderation: detecting inappropriate content, such as nudity or violence.
  • Video analysis: detecting characteristics that are specific to video.
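
| Service | Standard objects | Custom objects | Face detection | Face recognition | Content moderation | Video analysis | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Amazon Rekognition | X | X | X | X | X | X | |
| Imagga | X | X | X | X | X | | Supports on-premise deployment |
| Clarifai | X | X | X | | X | | |
| Google Cloud Vision | X | X | | | | | |
| Microsoft Azure Computer Vision | X | X | X | X | | | Extensive certification for government work |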
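
To make the "image in, labels out" pattern described above concrete, here is a minimal sketch built around Amazon Rekognition's `detect_labels` operation as exposed by boto3. The helper name `label_image` and the default thresholds are assumptions chosen for illustration, and the other services in the table have their own, differently shaped APIs.

```python
from typing import Any, List, Tuple


def label_image(client: Any, image_bytes: bytes,
                max_labels: int = 10,
                min_confidence: float = 80.0) -> List[Tuple[str, float]]:
    """Return (label, confidence) pairs for a single image.

    `client` is assumed to behave like boto3.client("rekognition");
    `label_image` itself is a hypothetical helper, not part of any SDK.
    """
    response = client.detect_labels(
        Image={"Bytes": image_bytes},   # raw image bytes (JPEG or PNG)
        MaxLabels=max_labels,           # cap on the number of labels returned
        MinConfidence=min_confidence,   # drop labels below this confidence (percent)
    )
    # Each entry in "Labels" carries a name and a confidence score.
    return [(label["Name"], label["Confidence"])
            for label in response.get("Labels", [])]
```

With boto3 installed and AWS credentials configured, the helper could be called as `label_image(boto3.client("rekognition"), open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder file name. Passing the client in as a parameter keeps the function easy to exercise with a stub, without network access.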

Natural Language Understanding

These APIs cover tasks that involve understanding natural language such as English. They can be used to perform tasks such as:

  • Text classification: classifying a piece of text into a set of custom-specified categories.
  • Sentiment analysis: a popular variety of text classification determining whether a piece of text is positive, negative, or neutral.
  • Entity recognition: identifying entities in a piece of text, such as people, places, and organizations.
  • Event recognition: identifying events in a piece of text, or “who did what to whom”.
  • Syntax analysis: identifying the grammatical structure of a piece of text, such as the parts of speech and the syntactic dependencies between words.
  • Search: searching for text that matches a query.
  • Summarization: condensing a longer text into a shorter one, either by extracting content or by generating a summary from scratch.

There are a number of options for each of these.

| Service | Languages covered | Text class. | Sentiment | Entities | Events | Syntax | Search | Summarization | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Amazon Comprehend | 12 languages | X | X | X | X | X | | X | |
| Clarifai | English | X | X | X | | | | | |
| Google Cloud Natural Language | 12 languages | X | X | X | | X | | | Has [healthcare-specific API](https://cloud.google.com/natural-language/healthcare/) |
| IBM Watson Natural Language Understanding | 23 languages | X | X | X | X | X | X | | Supports on-premise deployment. |
| Microsoft Azure Text Analytics | 1-115 languages (task dependent) | X | X | X | | | | X | Has medical text models. |
| Aylien | 5 languages | X | X | X | | | | | Focuses on news processing. |
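
One way to narrow down the options is to treat the table above as a capability matrix and filter it by the tasks you need. The sketch below does this in Python. The capability keys (`text_class`, `events`, and so on) and the function name `services_supporting` are names invented for illustration, and the matrix reflects our reading of the table at the time of writing, so check each provider's documentation before relying on it.

```python
from typing import Dict, List, Set

# Capability matrix transcribed from the table above (keys are hypothetical names).
NLU_SERVICES: Dict[str, Set[str]] = {
    "Amazon Comprehend": {
        "text_class", "sentiment", "entities", "events", "syntax", "summarization"},
    "Clarifai": {"text_class", "sentiment", "entities"},
    "Google Cloud Natural Language": {
        "text_class", "sentiment", "entities", "syntax"},
    "IBM Watson Natural Language Understanding": {
        "text_class", "sentiment", "entities", "events", "syntax", "search"},
    "Microsoft Azure Text Analytics": {
        "text_class", "sentiment", "entities", "summarization"},
    "Aylien": {"text_class", "sentiment", "entities"},
}


def services_supporting(required: Set[str]) -> List[str]:
    """Return, in alphabetical order, the services that cover every required task."""
    return sorted(name for name, capabilities in NLU_SERVICES.items()
                  if required <= capabilities)  # subset test


print(services_supporting({"events", "syntax"}))
# ['Amazon Comprehend', 'IBM Watson Natural Language Understanding']
```

The same approach extends to other columns, such as language coverage or on-premise support, by adding fields to the matrix.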

Speech Processing

Speech recognition, or speech-to-text, converts spoken words into textual transcripts. Conversely, speech synthesis, or text-to-speech, converts text into spoken words.

Machine Translation

Machine translation is the task of translating text from one language to another.

Optical Character Recognition (OCR)

Optical character recognition (OCR) is the process of converting images of text into machine-readable text. It is also often important to recognize the format of the text, particularly for structured documents like tables.