39 changes: 38 additions & 1 deletion speech/README.md
@@ -22,16 +22,53 @@ Configure your project using [Application Default Credentials][adc]

## Usage

To run the Speech Samples:

$ php speech.php

Cloud Speech

Usage:
command [options] [arguments]

Options:
-h, --help Display this help message
-q, --quiet Do not output any message
-V, --version Display this application version
--ansi Force ANSI output
--no-ansi Disable ANSI output
-n, --no-interaction Do not ask any interactive question
-v|vv|vvv, --verbose Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug

Available commands:
help Displays help for a command
list Lists commands
transcribe Transcribe an audio file using Google Cloud Speech API
transcribe-async Transcribe an audio file asynchronously using Google Cloud Speech API
transcribe-async-gcs Transcribe audio asynchronously from a Storage Object using Google Cloud Speech API
transcribe-async-words Transcribe an audio file asynchronously and print word time offsets using Google Cloud Speech API
transcribe-gcs Transcribe audio from a Storage Object using Google Cloud Speech API
transcribe-stream Transcribe a stream of audio using Google Cloud Speech API
transcribe-words Transcribe an audio file and print word time offsets using Google Cloud Speech API

Once you have a speech sample in the proper format, send it through the Speech
API using one of the transcribe commands:

```sh
php speech.php transcribe test/data/audio32KHz.raw --encoding LINEAR16 --sample-rate 32000
php speech.php transcribe test/data/audio32KHz.flac --encoding FLAC --sample-rate 32000 --async
php speech.php transcribe-async test/data/audio32KHz.flac --encoding FLAC --sample-rate 32000
php speech.php transcribe-words test/data/audio32KHz.flac --encoding FLAC --sample-rate 32000
```

## Troubleshooting

If you get the following error, set the environment variable `GCLOUD_PROJECT` to your project ID:

```
[Google\Cloud\Core\Exception\GoogleException]
No project ID was provided, and we were unable to detect a default project ID.
```

If you have not set a timezone, you may get an error from PHP. This can be resolved by:

1. Finding where the `php.ini` is stored by running `php -i | grep 'Configuration File'`
4 changes: 3 additions & 1 deletion speech/composer.json
@@ -14,8 +14,10 @@
"src/streaming_recognize.php",
"src/transcribe_async.php",
"src/transcribe_async_gcs.php",
"src/transcribe_async_words.php",
"src/transcribe_sync.php",
"src/transcribe_sync_gcs.php"
"src/transcribe_sync_gcs.php",
"src/transcribe_sync_words.php"
]
},
"require-dev": {
218 changes: 153 additions & 65 deletions speech/speech.php
@@ -21,90 +21,178 @@
use Symfony\Component\Console\Application;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputDefinition;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;

$inputDefinition = new InputDefinition([
new InputArgument('audio-file', InputArgument::REQUIRED, 'The audio file to transcribe'),
new InputOption('encoding', null, InputOption::VALUE_REQUIRED,
'The encoding of the audio file. This is required if the encoding is ' .
'unable to be determined. '
),
new InputOption('language-code', null, InputOption::VALUE_REQUIRED,
'The language code for the language used in the source file. ',
'en-US'
),
new InputOption('sample-rate', null, InputOption::VALUE_REQUIRED,
'The sample rate of the audio file in hertz. This is required ' .
'if the sample rate is unable to be determined. '
),
]);

$application = new Application('Cloud Speech');
$application->add(new Command('transcribe'))
->setDescription('Transcribe Audio using Google Cloud Speech API')
->setDefinition($inputDefinition)
->setDescription('Transcribe an audio file using Google Cloud Speech API')
->setHelp(<<<EOF
The <info>%command.name%</info> command transcribes audio using the Google Cloud Speech API.
The <info>%command.name%</info> command transcribes audio from a file using the
Google Cloud Speech API.

<info>php %command.full_name% audio_file.wav</info>

EOF
)
->addArgument(
'audio-file',
InputArgument::REQUIRED,
'The audio file to transcribe'
)
->addOption(
'encoding',
null,
InputOption::VALUE_REQUIRED,
'The encoding of the audio file. This is required if the encoding is ' .
'unable to be determined. '
)
->addOption(
'language-code',
null,
InputOption::VALUE_REQUIRED,
'The language code for the language used in the source file. ',
'en-US'
)
->addOption(
'sample-rate',
null,
InputOption::VALUE_REQUIRED,
'The sample rate of the audio file in hertz. This is required ' .
'if the sample rate is unable to be determined. '
->setCode(function (InputInterface $input, OutputInterface $output) {
$audioFile = $input->getArgument('audio-file');
$languageCode = $input->getOption('language-code');
transcribe_sync($audioFile, $languageCode, [
'encoding' => $input->getOption('encoding'),
'sampleRateHertz' => $input->getOption('sample-rate'),
]);
});

$application->add(new Command('transcribe-gcs'))
->setDefinition($inputDefinition)
->setDescription('Transcribe audio from a Storage Object using Google Cloud Speech API')
->setHelp(<<<EOF
The <info>%command.name%</info> command transcribes audio from a Cloud Storage
Object using the Google Cloud Speech API.

<info>php %command.full_name% gs://my-bucket/audio_file.wav</info>

EOF
)
->addOption(
'async',
null,
InputOption::VALUE_NONE,
'Run the transcription asynchronously. '
->setCode(function (InputInterface $input, OutputInterface $output) {
$audioFile = $input->getArgument('audio-file');
$languageCode = $input->getOption('language-code');
if (!preg_match('/^gs:\/\/([a-z0-9\._\-]+)\/(\S+)$/', $audioFile, $matches)) {
throw new \Exception('Invalid file name. Must be gs://[bucket]/[audiofile]');
}
list($bucketName, $objectName) = array_slice($matches, 1);
transcribe_sync_gcs($bucketName, $objectName, $languageCode, [
'encoding' => $input->getOption('encoding'),
'sampleRateHertz' => $input->getOption('sample-rate'),
]);
});

$application->add(new Command('transcribe-words'))
->setDefinition($inputDefinition)
->setDescription('Transcribe an audio file and print word time offsets using Google Cloud Speech API')
->setHelp(<<<EOF
The <info>%command.name%</info> command transcribes audio from a file using the
Google Cloud Speech API and prints word time offsets.

<info>php %command.full_name% audio_file.wav</info>

EOF
)
->addOption(
'stream',
null,
InputOption::VALUE_NONE,
'Stream the audio file.'
->setCode(function (InputInterface $input, OutputInterface $output) {
$audioFile = $input->getArgument('audio-file');
$languageCode = $input->getOption('language-code');
transcribe_sync_words($audioFile, $languageCode, [
'encoding' => $input->getOption('encoding'),
'sampleRateHertz' => $input->getOption('sample-rate'),
]);
});

$application->add(new Command('transcribe-async'))
->setDefinition($inputDefinition)
->setDescription('Transcribe an audio file asynchronously using Google Cloud Speech API')
->setHelp(<<<EOF
The <info>%command.name%</info> command transcribes audio from a file using the
Google Cloud Speech API asynchronously.

<info>php %command.full_name% audio_file.wav</info>

EOF
)
->setCode(function (InputInterface $input, OutputInterface $output) {
$encoding = $input->getOption('encoding');
$audioFile = $input->getArgument('audio-file');
$languageCode = $input->getOption('language-code');
$sampleRate = $input->getOption('sample-rate');
transcribe_async($audioFile, $languageCode, [
'encoding' => $input->getOption('encoding'),
'sampleRateHertz' => $input->getOption('sample-rate'),
]);
});

$application->add(new Command('transcribe-async-gcs'))
->setDefinition($inputDefinition)
->setDescription('Transcribe audio asynchronously from a Storage Object using Google Cloud Speech API')
->setHelp(<<<EOF
The <info>%command.name%</info> command transcribes audio from a Cloud Storage
object asynchronously using the Google Cloud Speech API.

<info>php %command.full_name% gs://my-bucket/audio_file.wav</info>

EOF
)
->setCode(function (InputInterface $input, OutputInterface $output) {
$audioFile = $input->getArgument('audio-file');
$options = [
'encoding' => $encoding,
'languageCode' => $languageCode,
'sampleRateHertz' => $sampleRate,
];
if ($isGcs = preg_match('/^gs:\/\/([a-z0-9\._\-]+)\/(\S+)$/', $audioFile, $matches)) {
list($bucketName, $objectName) = array_slice($matches, 1);
}
if ($isGcs) {
if ($input->getOption('stream')) {
throw new LogicException('Cannot stream from a bucket!');
}
if ($input->getOption('async')) {
transcribe_async_gcs($bucketName, $objectName, $languageCode, $options);
} else {
transcribe_sync_gcs($bucketName, $objectName, $languageCode, $options);
}
} else {
if ($input->getOption('async')) {
transcribe_async($audioFile, $languageCode, $options);
} elseif ($input->getOption('stream')) {
$encodingInt = constant("Google\Cloud\Speech\V1\RecognitionConfig_AudioEncoding::$encoding");
streaming_recognize($audioFile, $languageCode, $encodingInt, $sampleRate);
} else {
transcribe_sync($audioFile, $languageCode, $options);
}
$languageCode = $input->getOption('language-code');
if (!preg_match('/^gs:\/\/([a-z0-9\._\-]+)\/(\S+)$/', $audioFile, $matches)) {
throw new \Exception('Invalid file name. Must be gs://[bucket]/[audiofile]');
}
list($bucketName, $objectName) = array_slice($matches, 1);
transcribe_async_gcs($bucketName, $objectName, $languageCode, [
'encoding' => $input->getOption('encoding'),
'sampleRateHertz' => $input->getOption('sample-rate'),
]);
});

$application->add(new Command('transcribe-async-words'))
->setDefinition($inputDefinition)
->setDescription('Transcribe an audio file asynchronously and print word time offsets using Google Cloud Speech API')
->setHelp(<<<EOF
The <info>%command.name%</info> command transcribes audio from a file using the
Google Cloud Speech API asynchronously and prints word time offsets.

<info>php %command.full_name% audio_file.wav</info>

EOF
)
->setCode(function (InputInterface $input, OutputInterface $output) {
$audioFile = $input->getArgument('audio-file');
$languageCode = $input->getOption('language-code');
transcribe_async_words($audioFile, $languageCode, [
'encoding' => $input->getOption('encoding'),
'sampleRateHertz' => $input->getOption('sample-rate'),
]);
});

$application->add(new Command('transcribe-stream'))
->setDefinition($inputDefinition)
->setDescription('Transcribe a stream of audio using Google Cloud Speech API')
->setHelp(<<<EOF
The <info>%command.name%</info> command transcribes audio from a stream using
the Google Cloud Speech API.

<info>php %command.full_name% audio_file.wav</info>

EOF
)
->setCode(function (InputInterface $input, OutputInterface $output) {
streaming_recognize(
$input->getArgument('audio-file'),
$input->getOption('language-code'),
$input->getOption('encoding'),
$input->getOption('sample-rate')
);
});

// for testing
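
The `transcribe-gcs` and `transcribe-async-gcs` commands above repeat the same `gs://` check. As a rough sketch only, the check could be isolated as follows. The `parse_gcs_uri` helper is hypothetical and not part of this PR; the pattern and error message are copied from the commands.

```php
<?php
// Hypothetical helper, not part of the PR: isolates the gs:// check that the
// transcribe-gcs and transcribe-async-gcs commands perform inline.
function parse_gcs_uri(string $uri): array
{
    // Same pattern as in speech.php: group 1 is the bucket, group 2 the object.
    if (!preg_match('/^gs:\/\/([a-z0-9\._\-]+)\/(\S+)$/', $uri, $matches)) {
        throw new InvalidArgumentException(
            'Invalid file name. Must be gs://[bucket]/[audiofile]'
        );
    }
    return [$matches[1], $matches[2]];
}

// Example URI chosen for illustration only.
list($bucketName, $objectName) = parse_gcs_uri('gs://my-bucket/audio/file.wav');
printf('%s %s' . PHP_EOL, $bucketName, $objectName);
```

Because the bucket group excludes `/`, everything after the first slash following the bucket name ends up in the object name, so nested object paths are kept whole.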
5 changes: 4 additions & 1 deletion speech/src/streaming_recognize.php
@@ -28,6 +28,7 @@
use Google\Cloud\Speech\V1\RecognitionConfig;
use Google\Cloud\Speech\V1\StreamingRecognitionConfig;
use Google\Cloud\Speech\V1\StreamingRecognizeRequest;
use Google\Cloud\Speech\V1\RecognitionConfig_AudioEncoding;

/**
* Transcribe an audio file using Google Cloud Speech API
@@ -64,8 +65,10 @@ function streaming_recognize($audioFile, $languageCode, $encoding, $sampleRateHe
try {
$config = new RecognitionConfig();
$config->setLanguageCode($languageCode);
$config->setEncoding($encoding);
$config->setSampleRateHertz($sampleRateHertz);
// encoding must be an enum, convert from string
$encodingEnum = constant(RecognitionConfig_AudioEncoding::class . '::' . $encoding);
$config->setEncoding($encodingEnum);

$strmConfig = new StreamingRecognitionConfig();
$strmConfig->setConfig($config);
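
The string-to-enum conversion in this hunk relies on PHP's `constant()` resolving a `Class::NAME` string. The sketch below shows that lookup in isolation, with an explicit check for unknown names. It assumes a stand-in class (`AudioEncodingStandIn`, with illustrative values) instead of the real `RecognitionConfig_AudioEncoding`, and the `encoding_from_string` helper is hypothetical.

```php
<?php
// Stand-in for Google\Cloud\Speech\V1\RecognitionConfig_AudioEncoding so the
// sketch runs without the client library; the values are illustrative only.
final class AudioEncodingStandIn
{
    const LINEAR16 = 1;
    const FLAC = 2;
}

// Hypothetical helper: resolve an encoding name such as "FLAC" to the value
// of the class constant with that name.
function encoding_from_string(string $name, string $enumClass): int
{
    $constantName = $enumClass . '::' . $name;
    // constant() fails on unknown names (a warning or an Error, depending on
    // the PHP version), so check first to give a clearer message.
    if (!defined($constantName)) {
        throw new InvalidArgumentException("Unknown encoding: $name");
    }
    return constant($constantName);
}

var_dump(encoding_from_string('FLAC', AudioEncodingStandIn::class));
```

Checking with `defined()` first means a mistyped `--encoding` value fails with a readable message before any request is built.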
9 changes: 0 additions & 9 deletions speech/src/transcribe_async.php
@@ -49,9 +49,6 @@ function transcribe_async($audioFile, $languageCode = 'en-US', $options = [])
'languageCode' => $languageCode,
]);

// When true, time offsets for every word will be included in the response.
$options['enableWordTimeOffsets'] = true;

// Create the asynchronous recognize operation
$operation = $speech->beginRecognizeOperation(
fopen($audioFile, 'r'),
@@ -74,12 +71,6 @@ function transcribe_async($audioFile, $languageCode = 'en-US', $options = [])
foreach ($alternatives as $alternative) {
printf('Transcript: %s' . PHP_EOL, $alternative['transcript']);
printf('Confidence: %s' . PHP_EOL, $alternative['confidence']);
foreach ($alternative['words'] as $wordInfo) {
printf(' Word: %s (start: %s, end: %s)' . PHP_EOL,
$wordInfo['word'],
$wordInfo['startTime'],
$wordInfo['endTime']);
}
}
}
}
9 changes: 0 additions & 9 deletions speech/src/transcribe_async_gcs.php
@@ -55,9 +55,6 @@ function transcribe_async_gcs($bucketName, $objectName, $languageCode = 'en-US',
$storage = new StorageClient();
$object = $storage->bucket($bucketName)->object($objectName);

// When true, time offsets for every word will be included in the response.
$options['enableWordTimeOffsets'] = true;

// Create the asynchronous recognize operation
$operation = $speech->beginRecognizeOperation(
$object,
@@ -80,12 +77,6 @@ function transcribe_async_gcs($bucketName, $objectName, $languageCode = 'en-US',
foreach ($alternatives as $alternative) {
printf('Transcript: %s' . PHP_EOL, $alternative['transcript']);
printf('Confidence: %s' . PHP_EOL, $alternative['confidence']);
foreach ($alternative['words'] as $wordInfo) {
printf(' Word: %s (start: %s, end: %s)' . PHP_EOL,
$wordInfo['word'],
$wordInfo['startTime'],
$wordInfo['endTime']);
}
}
}
}