C
-
int
DS_CreateModel(const char *aModelPath, unsigned int aBeamWidth, ModelState **retval)
Create an object providing an interface to a trained DeepSpeech model.
- Return
Zero on success, non-zero on failure.
- Parameters
aModelPath: The path to the frozen model graph.
aBeamWidth: The beam width used by the decoder. A larger beam width generates better results at the cost of decoding time.
[out] retval: A ModelState pointer.
-
void
DS_FreeModel(ModelState *ctx)
Free the associated resources and destroy the model object.
-
int
DS_EnableDecoderWithLM(ModelState *aCtx, const char *aLMPath, const char *aTriePath, float aLMAlpha, float aLMBeta)
Enable decoding using beam scoring with a KenLM language model.
- Return
Zero on success, non-zero on failure (invalid arguments).
- Parameters
aCtx: The ModelState pointer for the model being changed.
aLMPath: The path to the language model binary file.
aTriePath: The path to the trie file built from the same vocabulary as the language model binary.
aLMAlpha: The alpha hyperparameter of the CTC decoder (language model weight).
aLMBeta: The beta hyperparameter of the CTC decoder (word insertion weight).
-
int
DS_GetModelSampleRate(ModelState *aCtx)
Return the sample rate expected by a model.
- Return
Sample rate expected by the model for its input.
- Parameters
aCtx: A ModelState pointer created with DS_CreateModel.
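Since the audio buffers in this API are sized in samples, the rate returned by DS_GetModelSampleRate() determines how many samples a given duration occupies. The following is a minimal sketch of that arithmetic only; `samples_for_ms` is a hypothetical helper name, not part of the DeepSpeech API, and in real code its `sample_rate` argument would be the value returned by DS_GetModelSampleRate().

```c
#include <assert.h>

/* Hypothetical helper (not part of the DeepSpeech API): number of samples
 * that `ms` milliseconds of mono audio occupy at `sample_rate` Hz.
 * In real code, sample_rate would come from DS_GetModelSampleRate(). */
static unsigned int samples_for_ms(int sample_rate, unsigned int ms)
{
    assert(sample_rate > 0);
    /* Widen before multiplying so long durations do not overflow. */
    return (unsigned int)(((unsigned long long)sample_rate * ms) / 1000ULL);
}
```

For instance, with a sample rate of 16000 Hz, a 20 ms chunk works out to 320 samples.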
-
char *
DS_SpeechToText(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize)
Use the DeepSpeech model to perform Speech-To-Text.
- Return
The STT result. The user is responsible for freeing the string using DS_FreeString(). Returns NULL on error.
- Parameters
aCtx: The ModelState pointer for the model to use.
aBuffer: A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).
aBufferSize: The number of samples in the audio signal.
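Note that aBufferSize counts samples, not bytes. When the audio comes from a raw PCM file or byte buffer, the byte length has to be divided by the size of one 16-bit sample before it is passed on. A minimal sketch of that conversion, where `pcm_bytes_to_samples` is a hypothetical helper name and not part of the DeepSpeech API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the DeepSpeech API): convert the byte
 * length of a raw 16-bit mono PCM buffer into the sample count that the
 * aBufferSize parameter expects. */
static unsigned int pcm_bytes_to_samples(size_t num_bytes)
{
    /* Each sample is one 16-bit value, so a well-formed buffer has an
     * even number of bytes. */
    assert(num_bytes % sizeof(int16_t) == 0);
    return (unsigned int)(num_bytes / sizeof(int16_t));
}
```

A 32000-byte buffer of 16-bit mono audio therefore holds 16000 samples.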
-
Metadata *
DS_SpeechToTextWithMetadata(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize)
Use the DeepSpeech model to perform Speech-To-Text and output metadata about the results.
- Return
A struct of individual letters along with their timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.
- Parameters
aCtx: The ModelState pointer for the model to use.
aBuffer: A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).
aBufferSize: The number of samples in the audio signal.
-
int
DS_CreateStream(ModelState *aCtx, StreamingState **retval)
Create a new streaming inference state. The streaming state returned by this function can then be passed to DS_FeedAudioContent() and DS_FinishStream().
- Return
Zero on success, non-zero on failure.
- Parameters
aCtx: The ModelState pointer for the model to use.
[out] retval: An opaque pointer that represents the streaming state. Can be NULL if an error occurs.
-
void
DS_FeedAudioContent(StreamingState *aSctx, const short *aBuffer, unsigned int aBufferSize)
Feed audio samples to an ongoing streaming inference.
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
aBuffer: An array of 16-bit, mono raw audio samples at the appropriate sample rate (matching what the model was trained on).
aBufferSize: The number of samples in aBuffer.
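Streaming callers usually hand audio over in small pieces rather than in one call. The sketch below shows one way to split a buffer into fixed-size chunks and pass each to a function with the same signature as DS_FeedAudioContent. The names `feed_fn`, `feed_in_chunks`, `counting_feed`, and `g_samples_seen` are hypothetical and not part of the DeepSpeech API. The StreamingState forward declaration is there only so the sketch compiles on its own, and `counting_feed` is a stand-in that merely counts samples; real code would include the DeepSpeech header and pass DS_FeedAudioContent instead.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque type, forward-declared only so this sketch is self-contained;
 * real code gets it from the DeepSpeech header. */
typedef struct StreamingState StreamingState;

/* Function-pointer type with the same signature as DS_FeedAudioContent. */
typedef void (*feed_fn)(StreamingState *, const short *, unsigned int);

/* Hypothetical helper: feed `total` samples in chunks of at most `chunk`
 * samples each. Returns the number of feed calls made. */
static unsigned int feed_in_chunks(feed_fn feed, StreamingState *sctx,
                                   const short *samples, unsigned int total,
                                   unsigned int chunk)
{
    unsigned int calls = 0;
    unsigned int offset = 0;

    assert(feed != NULL && chunk > 0);
    while (offset < total) {
        unsigned int n = total - offset;   /* samples still to send */
        if (n > chunk) {
            n = chunk;                     /* cap at the chunk size */
        }
        feed(sctx, samples + offset, n);
        offset += n;
        calls++;
    }
    return calls;
}

/* Stand-in for DS_FeedAudioContent (assumption for illustration only):
 * it records how many samples it was given and does no inference. */
static unsigned int g_samples_seen = 0;

static void counting_feed(StreamingState *sctx, const short *buf,
                          unsigned int n)
{
    (void)sctx;
    (void)buf;
    g_samples_seen += n;
}
```

With 1000 samples and a chunk size of 320, the helper makes four calls: three full chunks and a final one of 40 samples.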
-
char *
DS_IntermediateDecode(StreamingState *aSctx)
Compute the intermediate decoding of an ongoing streaming inference.
- Return
The STT intermediate result. The user is responsible for freeing the string using DS_FreeString().
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
-
char *
DS_FinishStream(StreamingState *aSctx)
Signal the end of an audio signal to an ongoing streaming inference and return the STT result over the whole audio signal.
- Return
The STT result. The user is responsible for freeing the string using DS_FreeString().
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
-
Metadata *
DS_FinishStreamWithMetadata(StreamingState *aSctx)
Signal the end of an audio signal to an ongoing streaming inference and return per-letter metadata.
- Return
A struct of individual letters along with their timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
-
void
DS_FreeStream(StreamingState *aSctx)
Destroy a streaming state without decoding the computed logits. This can be used if you no longer need the result of an ongoing streaming inference and don’t want to perform a costly decode operation.
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
-
void
DS_FreeString(char *str)
Free a char* string returned by the DeepSpeech API.
-
void
DS_PrintVersions()
Print the version of this library and of the linked TensorFlow library.