For approximately ten years, Texas Instruments has been developing solid-state speech technology that produces speech which accurately captures the characteristics of the spoken voice, including intonation, accent, dialect, and pitch. When connected to a microcomputer, it enables the formation of complete phrases and sentences, facilitating voice communication between the computer and the user. The applications of this technology are extensive and beneficial to a wide range of users. The carefully curated word library offers various applications in home and industrial settings, including telephone systems, burglar alarms, conversational interfaces, messaging, gaming, electronic notifications, studio control, speaking clocks, temperature indications, calendars, business coding, factory announcements, and accounting.
This month, a complete building project will be presented, available for purchase as a kit, along with guidance on interfacing the WORDMAKER board to a microcomputer. Possible interface circuits will be included, along with BASIC programming examples. The system has been thoroughly tested on the Sharp MZ-80K and Tangerine systems, with further details available for other popular microcomputers. Additional information regarding the speech synthesis processes utilized will follow, and contributions from readers regarding interfacing and application ideas are encouraged.
The E and MM WORDMAKER Speech Synthesizer is based on the Texas Instruments Voice Synthesis Processor (VSP). This card can be connected to any computer system or function as an independent unit. It consists of the Texas TMS 5100 Voice Synthesis Processor, a memory bank containing the vocabulary, and an onboard amplifier.
The synthesis method employed is Linear Predictive Coding (LPC), a technique developed by Texas Instruments that reduces the storage requirements for each word. Human speech, like most communication signals, contains a significant amount of redundant information. LPC analyzes the entire word as a binary data string and eliminates redundant data. The coding is then evaluated to ensure satisfactory speech reproduction. The TMS 5100 features a 10-pole digital filter that synthesizes the voice, controlled by the LPC data. The length of the data string written to the TMS 5100 for each word sample can range from 4 to 49 bits, necessitating a relatively high level of processing capability.
The TMS 5100 has five control lines. The command is configured on the CTL lines and executed by toggling the command clock line (PDC). The complete list of commands is provided in Table 1, and the pin configuration of the IC is illustrated in Figure 1.
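The command mechanism just described (set up a code on the CTL lines, then toggle PDC) can be sketched in Python against a simulated port. This is only an illustration under stated assumptions: the `FakeVspPort` class, the helper names, the choice of the rising PDC edge as the sampling point, and the least-significant-nibble-first address order are all hypothetical. The value `PLACEHOLDER_LOAD_ADDRESS` is a stand-in for the real code listed in Table 1, and real port access depends on the host microcomputer's interface.

```python
PLACEHOLDER_LOAD_ADDRESS = 0x2  # placeholder only; take the real code from Table 1


class FakeVspPort:
    """Stand-in for the host's output port; records what each PDC strobe latches."""

    def __init__(self):
        self.ctl = 0        # current 4-bit value on the CTL lines
        self.pdc = 0        # current level of the PDC line
        self.latched = []   # CTL values present at each rising edge of PDC

    def write_ctl(self, nibble):
        if not 0 <= nibble <= 0xF:
            raise ValueError("CTL value must fit in 4 bits")
        self.ctl = nibble

    def set_pdc(self, level):
        # Assumption for this sketch: CTL is sampled on the rising edge of PDC.
        if level and not self.pdc:
            self.latched.append(self.ctl)
        self.pdc = 1 if level else 0


def send_nibble(port, nibble):
    """Set up four bits on CTL, then toggle PDC high and back low."""
    port.write_ctl(nibble)
    port.set_pdc(1)
    port.set_pdc(0)


def load_address(port, address, nibble_count):
    """Send an address as command/data pairs, one nibble per Load Address command.

    The nibble count and the least-significant-first order are assumptions.
    """
    for i in range(nibble_count):
        send_nibble(port, PLACEHOLDER_LOAD_ADDRESS)
        send_nibble(port, (address >> (4 * i)) & 0xF)


if __name__ == "__main__":
    port = FakeVspPort()
    load_address(port, 0x1A3, 3)
    print([hex(value) for value in port.latched])
```

Keeping the strobe in one small helper means that only `write_ctl` and `set_pdc` need rewriting when the sketch is moved to a real output port.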
Load Address Command: This command prompts the VSP to receive a subsequent nibble (4 bits) of data configured on the CTL lines as a speech address segment, which is then transferred to the voice synthesis (V/S) ROM address registers.
Read and Branch: This command instructs the VSP to generate the necessary control signals for the V/S ROM, allowing it to update its address registers with the contents of the currently addressed byte pair.
Speak: Upon receiving this command, the VSP takes control of the V/S ROM, generating pulses on its I/O line to retrieve bit-serial data from the ROM and initiate speech production. Pulses on the I/O line occur in bursts at a frame interval of 25 milliseconds, with the number of pulses per frame varying from 4 to 49, depending on the data. The timing of I/O pulses for a maximum length of 49 bits is depicted in Figure 2. Further details regarding the data structure will be addressed in a subsequent article.
Test Busy: This command allows the controller to access the TALK STATUS LATCH of the VSP. In operation, the command is first configured on the CTL lines, followed by a toggle of the PDC line. A subsequent toggle of the PDC line enables the Talk Status to be output to the CTL1 line. The Talk Status remains high during speech generation and is set low upon encountering an END or PHRASE code.
For some ten years Texas Instruments have been developing solid state speech technology with the result that speech can now be produced which faithfully preserves the character of the spoken voice including intonation, accent, dialect, and pitch. Linked to a microcomputer, words can be strung together to make complete phrases and sentences so that
voice communication between 'computer' and human becomes possible. The uses of this project are far-reaching and will be of benefit to almost anyone who uses it. The carefully selected word library has many applications in the home and industry, for telephone, burglar alarms, conversations, messages, games, electronic terms, studio control, speaking clock, temperature indication, calendar, business coding, factory announcements, and accountancy. This month we shall present the complete building project, which can be purchased as a kit, and explain how to interface the WORDMAKER board to a microcomputer.
Possible interface circuits are included and BASIC programs are also given. It has already been fully tested on the Sharp MZ-80K and Tangerine systems. Further details are provided later for other popular micros and we shall be following this article with additional information on the processes of speech synthesis employed. Readers' ideas for interfacing and use will be welcomed. The E and MM WORDMAKER Speech Synthesiser is based on the Texas Instruments Voice Synthesis Processor (VSP).
This card can be interfaced to any computer system or used as an independent unit. The card comprises the Texas TMS 5100 Voice Synthesis Processor, a memory bank containing the vocabulary, and an onboard amplifier. The synthesis method used is called Linear Predictive Coding (LPC). This is a technique developed by Texas which minimises the amount of storage needed for each word. Human speech, like most communication signals, contains a large proportion of redundant information. LPC involves looking at the complete word as a binary data string and removing any redundant data. The coding is then tested to check that the word is spoken satisfactorily. The TMS 5100 contains a 10-pole digital filter which synthesises the voice; the filter is controlled by the LPC data.
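To give a feel for what a multi-pole digital filter controlled by coded data does, here is a minimal Python sketch of an all-pole filter in lattice form. It is an illustration under assumptions, not the TMS 5100's actual arithmetic: the lattice structure, the function name `lattice_synthesise`, the floating-point maths, and every coefficient value are chosen purely for the example, and the chip's own coefficient tables are not reproduced.

```python
def lattice_synthesise(excitation, k):
    """Pass excitation samples through an all-pole lattice filter.

    k holds one reflection coefficient per pole (ten of them for a 10-pole
    filter); each should satisfy |k| < 1 so that the filter stays stable.
    """
    order = len(k)
    state = [0.0] * order  # backward signals remembered from the previous sample
    output = []
    for sample in excitation:
        f = sample
        # Work from the last stage down to the first.
        for i in range(order - 1, -1, -1):
            f -= k[i] * state[i]
            if i + 1 < order:
                # Update the next stage's memory; its old value was already used.
                state[i + 1] = state[i] + k[i] * f
        state[0] = f
        output.append(f)
    return output


if __name__ == "__main__":
    # Illustrative coefficients only: ten values, one per pole.
    coefficients = [-0.5, 0.3, -0.2, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    # A crude 'voiced' excitation: one pulse every eight samples.
    pulses = [1.0 if n % 8 == 0 else 0.0 for n in range(32)]
    for value in lattice_synthesise(pulses, coefficients)[:8]:
        print(round(value, 4))
```

In a scheme of this kind, changing the coefficient set from frame to frame is what reshapes the spectrum of the excitation into different speech sounds, which is why only a few coefficients need to be stored rather than the waveform itself.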
For each word sample, the length of the data string written to the TMS 5100 may vary from 4 to 49 bits. The device, therefore, requires quite a high level of 'intelligence'. The TMS 5100 has five control lines. The command is set up on the CTL lines and executed by toggling the command clock line, PDC. Table 1 shows the complete list of commands and Figure 1 gives the pin configuration of the IC.
Load Address Command: This command causes the VSP to accept a subsequent nibble (4 bits) of data set up on the CTL lines as a speech address segment, which is transferred to the voice synthesis (V/S) ROM address registers.
Read and Branch: This instructs the VSP to set up appropriate control signals to the V/S ROM, causing it to update its address registers with the contents of the currently addressed pair of bytes.
Speak: On receiving this command, the VSP takes over control of the V/S ROM and generates pulses on its I/O line to fetch bit-serial data from the ROM, and commences speech. Pulses on the I/O line occur in bursts at a frame interval of twenty-five milliseconds. The number of pulses in any one frame varies from 4 to 49, depending on the data. The timing of I/O pulses for a maximum length of 49 bits is shown in Figure 2. Details of the data structure will be discussed in a future article.
Test Busy: This command permits the controller to access the TALK STATUS LATCH of the VSP. In operation the command is first set up on the CTL lines and the PDC line toggled once. A subsequent toggle of the PDC line enables the Talk Status to be output to the CTL1 line. The Talk Status will be high during the execution of speech generation and will be set low on an END or PHRASE code being encountered.
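The Test Busy sequence described above (set up the command, toggle PDC once, toggle it again, then read the Talk Status on CTL1) can be sketched as a polling loop in Python. The `SimulatedVsp` class, its methods, and the `TEST_BUSY_CODE` value are hypothetical stand-ins so that the example runs without hardware. The real command code is the one given in Table 1, and how the host releases CTL1 so that it can be read back depends on the interface circuit used.

```python
TEST_BUSY_CODE = 0xE  # placeholder only; take the real code from Table 1


class SimulatedVsp:
    """Toy model that reports 'talking' for a fixed number of status polls."""

    def __init__(self, busy_polls):
        self._busy_polls = busy_polls    # polls that will still read busy
        self._ctl = 0                    # value the host has set up on CTL
        self._awaiting_status = False    # True after the first PDC toggle
        self._ctl1_out = 0               # level the VSP drives onto CTL1

    def write_ctl(self, nibble):
        self._ctl = nibble & 0xF

    def toggle_pdc(self):
        if self._awaiting_status:
            # Second toggle: Talk Status is output on CTL1.
            self._ctl1_out = 1 if self._busy_polls > 0 else 0
            if self._busy_polls > 0:
                self._busy_polls -= 1
            self._awaiting_status = False
        elif self._ctl == TEST_BUSY_CODE:
            # First toggle: the Test Busy command is accepted.
            self._awaiting_status = True

    def read_ctl1(self):
        return self._ctl1_out


def is_talking(vsp):
    """One Test Busy cycle: set up the command, toggle PDC twice, read CTL1."""
    vsp.write_ctl(TEST_BUSY_CODE)
    vsp.toggle_pdc()
    vsp.toggle_pdc()
    return vsp.read_ctl1() == 1


def wait_until_silent(vsp, max_polls=1000):
    """Poll until Talk Status goes low; return the number of polls taken."""
    for polls in range(1, max_polls + 1):
        if not is_talking(vsp):
            return polls
    raise TimeoutError("Talk Status still high after %d polls" % max_polls)


if __name__ == "__main__":
    print(wait_until_silent(SimulatedVsp(busy_polls=3)))
```

A host program would normally call something like `wait_until_silent` after each Speak command and before loading the next word address, so that words are not cut short. The poll limit is there so that a wiring fault cannot hang the program.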