whisper.cpp/bindings/go/pkg/whisper/interface.go

package whisper

import (
	"io"
	"time"
)

///////////////////////////////////////////////////////////////////////////////
// TYPES

// SegmentCallback is the callback function for processing segments in real
// time. It is called during the Process function.
type SegmentCallback func(Segment)

// ProgressCallback is the callback function for reporting progress during
// processing. It is called during the Process function.
type ProgressCallback func(int)

// Model is the interface to a whisper model. Create a new model with the
// function whisper.New(string).
type Model interface {
	io.Closer

	// Return a new speech-to-text context.
	NewContext() (Context, error)

	// Return true if the model is multilingual.
	IsMultilingual() bool

	// Return all languages supported.
	Languages() []string
}

// Context is the speech recognition context.
type Context interface {
	SetLanguage(string) error // Set the language to use for speech recognition; use "auto" to auto-detect the language.
	SetTranslate(bool)        // Set translate flag
	IsMultilingual() bool     // Return true if the model is multilingual.
	Language() string         // Get language

	SetOffset(time.Duration)          // Set offset
	SetDuration(time.Duration)        // Set duration
	SetThreads(uint)                  // Set number of threads to use
	SetSplitOnWord(bool)              // Set split on word flag
	SetTokenThreshold(float32)        // Set timestamp token probability threshold
	SetTokenSumThreshold(float32)     // Set timestamp token sum probability threshold
	SetMaxSegmentLength(uint)         // Set max segment length in characters
	SetTokenTimestamps(bool)          // Set token timestamps flag
	SetMaxTokensPerSegment(uint)      // Set max tokens per segment (0 = no limit)
	SetAudioCtx(uint)                 // Set audio encoder context
	SetMaxContext(n int)              // Set maximum number of text context tokens to store
	SetBeamSize(n int)                // Set beam size
	SetEntropyThold(t float32)        // Set entropy threshold
	SetInitialPrompt(prompt string)   // Set initial prompt
	SetTemperature(t float32)         // Set temperature
	SetTemperatureFallback(t float32) // Set temperature increment

	// Process mono audio data and return any errors.
	// If the callbacks are non-nil, newly generated segments are passed to
	// the SegmentCallback and progress updates to the ProgressCallback
	// during processing.
	Process([]float32, SegmentCallback, ProgressCallback) error

	// After Process is called, return segments one at a time. Once the end
	// of the stream is reached, io.EOF is returned.
	NextSegment() (Segment, error)

	IsBEG(Token) bool          // Test for "begin" token
	IsSOT(Token) bool          // Test for "start of transcription" token
	IsEOT(Token) bool          // Test for "end of transcription" token
	IsPREV(Token) bool         // Test for "start of prev" token
	IsSOLM(Token) bool         // Test for "start of lm" token
	IsNOT(Token) bool          // Test for "no timestamps" token
	IsLANG(Token, string) bool // Test for token associated with a specific language
	IsText(Token) bool         // Test for text token

	// Timings
	PrintTimings() // Print timing information
	ResetTimings() // Reset timing information

	SystemInfo() string // Return system information
}

// Segment is the text result of a speech recognition.
type Segment struct {
	// Segment number
	Num int

	// Beginning and end timestamps of the segment.
	Start, End time.Duration

	// The text of the segment.
	Text string

	// The tokens of the segment.
	Tokens []Token
}

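// Token is a text or special token.
type Token struct {
	Id         int           // Token identifier
	Text       string        // Token text
	P          float32       // Token probability
	Start, End time.Duration // Beginning and end timestamps of the token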
}
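
The NextSegment contract described above (return segments until io.EOF) can be illustrated with a minimal, self-contained sketch. It does not load a real model: `segment`, `segmentSource`, `fakeContext`, and `collect` are hypothetical names invented for this illustration, with `segment` mirroring only the Start, End, and Text fields of Segment. A real program would obtain a Context from Model.NewContext and call Process before draining segments.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"time"
)

// segment is a hypothetical stand-in that mirrors the Start, End and Text
// fields of the Segment struct; it is not the binding's own type.
type segment struct {
	Start, End time.Duration
	Text       string
}

// segmentSource is a hypothetical interface with the single method of
// Context that this sketch needs.
type segmentSource interface {
	NextSegment() (segment, error)
}

// fakeContext replays canned segments and then reports io.EOF, imitating
// the documented behaviour of NextSegment after Process has been called.
type fakeContext struct {
	segments []segment
	next     int
}

func (f *fakeContext) NextSegment() (segment, error) {
	if f.next >= len(f.segments) {
		return segment{}, io.EOF
	}
	s := f.segments[f.next]
	f.next++
	return s, nil
}

// collect drains src until io.EOF and formats each segment as one line.
// Any error other than io.EOF is returned along with the lines read so far.
func collect(src segmentSource) ([]string, error) {
	var lines []string
	for {
		seg, err := src.NextSegment()
		if errors.Is(err, io.EOF) {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
		lines = append(lines, fmt.Sprintf("[%v -> %v] %s", seg.Start, seg.End, seg.Text))
	}
}

func main() {
	ctx := &fakeContext{segments: []segment{
		{Start: 0, End: 2 * time.Second, Text: "hello"},
		{Start: 2 * time.Second, End: 3500 * time.Millisecond, Text: "world"},
	}}
	lines, err := collect(ctx)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Treating io.EOF as the normal end of the loop, rather than as a failure, keeps genuine errors distinguishable from the end of the segment stream.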