Using the Language Client
Documents
This guide covers the following Google Natural Language API methods:
analyze_entities(), analyze_sentiment(), analyze_entity_sentiment(), and
annotate_text(). Each method uses a Document to represent text.
>>> document = language.types.Document(
...     content='Google, headquartered in Mountain View, unveiled the '
...             'new Android phone at the Consumer Electronic Show. '
...             'Sundar Pichai said in his keynote that users love '
...             'their new Android phones.',
...     language='en',
...     type='PLAIN_TEXT',
... )
The document’s language defaults to None, which will cause the API to
auto-detect the language.
In addition, you can construct an HTML document:
>>> html_content = """\
... <html>
...   <head>
...     <title>El Tiempo de las Historias</title>
...   </head>
...   <body>
...     <p>La vaca saltó sobre la luna.</p>
...   </body>
... </html>
... """
>>> document = language.types.Document(
...     content=html_content,
...     language='es',
...     type='HTML',
... )
The language argument accepts either an ISO-639-1 or a BCP-47 language
code. The API reference page contains the full list of supported languages.
In addition to supplying the text / HTML content, a document can refer to content stored in Google Cloud Storage.
>>> document = language.types.Document(
...     gcs_content_uri='gs://my-text-bucket/sentiment-me.txt',
...     type=language.enums.Document.Type.HTML,
... )
Analyze Entities
The analyze_entities()
method finds named entities (i.e. proper names) in the text. This method
returns an AnalyzeEntitiesResponse.
>>> document = language.types.Document(
...     content='Michelangelo Caravaggio, Italian painter, is '
...             'known for "The Calling of Saint Matthew".',
...     type=language.enums.Document.Type.PLAIN_TEXT,
... )
>>> response = client.analyze_entities(
...     document=document,
...     encoding_type='UTF32',
... )
>>> for entity in response.entities:
...     print('=' * 20)
...     print(' name: {0}'.format(entity.name))
...     print(' type: {0}'.format(entity.type))
...     print(' metadata: {0}'.format(entity.metadata))
...     print(' salience: {0}'.format(entity.salience))
====================
 name: Michelangelo Caravaggio
 type: PERSON
 metadata: {'wikipedia_url': 'https://en.wikipedia.org/wiki/Caravaggio'}
 salience: 0.7615959
====================
 name: Italian
 type: LOCATION
 metadata: {'wikipedia_url': 'https://en.wikipedia.org/wiki/Italy'}
 salience: 0.19960518
====================
 name: The Calling of Saint Matthew
 type: EVENT
 metadata: {'wikipedia_url': 'https://en.wikipedia.org/wiki/The_Calling_of_St_Matthew_(Caravaggio)'}
 salience: 0.038798928
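Since each entity carries a salience value, a common follow-up is to keep only the most prominent entities. The following is a minimal sketch that makes no API call: the Entity namedtuple is a stand-in for the objects in response.entities, the salient_entities helper and its 0.1 cutoff are hypothetical, and the sample values are copied from the output above.

```python
from collections import namedtuple

# Stand-in for the objects in response.entities (assumption: only the
# name and salience attributes are needed here).
Entity = namedtuple('Entity', ['name', 'salience'])


def salient_entities(entities, min_salience=0.1):
    """Return entities at or above min_salience, most salient first."""
    kept = [e for e in entities if e.salience >= min_salience]
    return sorted(kept, key=lambda e: e.salience, reverse=True)


# Sample values taken from the example output, in arbitrary order.
sample = [
    Entity('Italian', 0.19960518),
    Entity('The Calling of Saint Matthew', 0.038798928),
    Entity('Michelangelo Caravaggio', 0.7615959),
]
top_names = [e.name for e in salient_entities(sample)]
print(top_names)
```

With a real response, the same helper should work on response.entities directly, because it only reads the name and salience attributes.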
NOTE: It is recommended to send an encoding_type argument to Natural
Language methods, so they provide useful offsets for the data they return.
While the correct value varies by environment, in Python you usually
want UTF32.
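One way to see why UTF32 is the usual choice in Python: a Python 3 str is indexed by Unicode code point, and a UTF-32 code unit is exactly one code point, so UTF32 offsets can be used directly as string indices. The sketch below uses only the standard library and makes no API call; the helper name offset_in_units is hypothetical.

```python
def offset_in_units(text, index, encoding):
    """Count the code units of `encoding` that precede text[index]."""
    # Bytes per code unit; the "-le" variants avoid a byte-order mark.
    unit_size = {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}[encoding]
    return len(text[:index].encode(encoding)) // unit_size


text = 'La vaca saltó 🌙 sobre la luna.'
index = text.index('sobre')  # Python's own (code point) index

offsets = {
    encoding: offset_in_units(text, index, encoding)
    for encoding in ('utf-8', 'utf-16-le', 'utf-32-le')
}
print(index, offsets)
# 16 {'utf-8': 20, 'utf-16-le': 17, 'utf-32-le': 16}
```

Only the UTF-32 offset matches the Python index. The accented letter takes two bytes in UTF-8, and the emoji takes four bytes in UTF-8 and two code units in UTF-16, so offsets in those encodings drift away from str indices.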
Analyze Sentiment
The analyze_sentiment() method
analyzes the sentiment of the provided text. This method returns a
AnalyzeSentimentResponse.
>>> document = language.types.Document(
...     content='Jogging is not very fun.',
...     type='PLAIN_TEXT',
... )
>>> response = client.analyze_sentiment(
...     document=document,
...     encoding_type='UTF32',
... )
>>> sentiment = response.document_sentiment
>>> print(sentiment.score)
-1
>>> print(sentiment.magnitude)
0.8
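The score ranges from -1.0 (negative) to 1.0 (positive), while the magnitude is a non-negative measure of how much emotion the text contains overall. The following sketch shows one possible way to turn the two numbers into a label. The describe_sentiment helper and both cutoff values are illustrative assumptions, not part of the library or an official recommendation.

```python
def describe_sentiment(score, magnitude, score_cutoff=0.25, magnitude_cutoff=0.5):
    """Map a (score, magnitude) pair to a coarse label.

    The cutoffs are hypothetical defaults; tune them for your own data.
    """
    if score <= -score_cutoff:
        return 'negative'
    if score >= score_cutoff:
        return 'positive'
    # A score near zero with a high magnitude suggests that positive and
    # negative passages cancel out; with a low magnitude, little emotion.
    if magnitude >= magnitude_cutoff:
        return 'mixed'
    return 'neutral'


# Values from the example above: score -1, magnitude 0.8.
label = describe_sentiment(-1, 0.8)
print(label)
# negative
```

With a real response, the call would be describe_sentiment(sentiment.score, sentiment.magnitude).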
Analyze Entity Sentiment
The analyze_entity_sentiment() method effectively combines
analyze_entities() and analyze_sentiment(): it reports a sentiment for
each entity it finds. This method returns an
AnalyzeEntitySentimentResponse.
>>> document = language.types.Document(
... content='Mona said that jogging is very fun.',
... type='PLAIN_TEXT',
... )
>>> response = client.analyze_entity_sentiment(
... document=document,
... encoding_type='UTF32',
... )
>>> entities = response.entities
>>> entities[0].name
'Mona'
>>> entities[1].name
'jogging'
>>> entities[1].sentiment.magnitude
0.8
>>> entities[1].sentiment.score
0.8
Annotate Text
The annotate_text() method
analyzes a document and is intended for users who are familiar with
machine learning and need in-depth text features to build upon. This method
returns an AnnotateTextResponse.
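A call to annotate_text() might be wrapped as in the sketch below. It assumes that the method takes document, features, and encoding_type keyword arguments like the methods above, and that features can be given as a dict whose keys (extract_syntax, extract_entities, extract_document_sentiment) mirror the API's feature flags. Check the API reference for your installed version. The helper names build_features and annotate are hypothetical.

```python
def build_features(syntax=True, entities=True, sentiment=True):
    """Build the feature flags for annotate_text() as a plain dict.

    Assumption: the key names mirror the API's feature fields.
    """
    return {
        'extract_syntax': syntax,
        'extract_entities': entities,
        'extract_document_sentiment': sentiment,
    }


def annotate(client, document, features=None):
    """Call client.annotate_text() with every feature on by default."""
    if features is None:
        features = build_features()
    return client.annotate_text(
        document=document,
        features=features,
        encoding_type='UTF32',
    )


# Request entities and document sentiment, but skip syntax analysis.
features = build_features(syntax=False)
print(features)
```

Given a client and document constructed as in the earlier sections, the call would be response = annotate(client, document, features).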