Convert Text-to-Speech with Python
Text has always been one of the most natural ways humans communicate with machines. We write messages, create documents, send emails, and store ideas in words. But sometimes reading is not the best format. Maybe your eyes are tired, maybe you're multitasking, maybe you want to learn while walking, or maybe accessibility matters. That is where Text-to-Speech becomes incredibly useful.
With Text-to-Speech, written content transforms into spoken audio. A paragraph becomes a voice. A blog post becomes a podcast-style file. Notes become something you can listen to while driving. In many ways, it changes how we interact with information.
Python is one of the best languages for building Text-to-Speech tools because it is simple, powerful, and supported by many excellent libraries. Whether you want offline speech generation, cloud-quality voices, or automated batch conversion, Python gives you practical options.
In this complete guide, we will explore how to convert text to speech with Python, step by step. We will start with beginner-friendly examples and then move into advanced workflows, audio exports, APIs, desktop tools, and real-world automation.
What Does Converting Text to Speech Mean?
Converting text to speech means taking written text and generating spoken audio from it.
Example:
Input text:
Welcome to my Python project.
Output:
🎧 An audio voice saying:
"Welcome to my Python project."
That output may be:
Played instantly through speakers
Saved as MP3 or WAV
Used inside an app
Sent to users
Combined with videos
Used for accessibility tools
Why Use Python for Text-to-Speech?
Python is a strong choice because it offers:
Easy syntax
Fast development speed
Great libraries
Strong automation support
API integrations
Cross-platform support
You can build small scripts in minutes or full production systems later.
Common Real-World Uses
Text-to-Speech is more useful than many people first assume.
Accessibility
Help visually impaired users hear text.
Learning Tools
Read lessons, articles, or vocabulary aloud.
Content Creation
Convert blogs into audio versions.
Smart Assistants
Build voice-enabled bots.
Notifications
Speak alerts, reminders, or status updates.
Productivity
Listen to notes while multitasking.
Best Python Libraries for Text-to-Speech
Several libraries exist. Each has strengths.
| Library | Works Offline | Easy to Use | Natural Voices | Internet Required |
|---|---|---|---|---|
| pyttsx3 | Yes | Yes | Medium | No |
| gTTS | No | Very Easy | Good | Yes |
| edge-tts | No | Easy | Very Good | Yes |
| Coqui TTS | Optional | Medium | Excellent | Sometimes |
| Azure / Google APIs | No | Medium | Premium | Yes |
For beginners, start with:
pyttsx3 for offline use
gTTS for easy MP3 export
edge-tts for natural free voices
Method 1: Convert Text-to-Speech Offline with pyttsx3
Install
pip install pyttsx3
Basic Example
import pyttsx3
engine = pyttsx3.init()
engine.say("Hello Hassan, welcome to Python text to speech.")
engine.runAndWait()
Your computer speaks instantly.
Change Voice Speed
import pyttsx3
engine = pyttsx3.init()
engine.setProperty("rate", 160)
engine.say("This speech is slower and easier to hear.")
engine.runAndWait()
Change Volume
engine.setProperty("volume", 1.0)
Range:
0.0 = mute
1.0 = full volume
Change Voice
voices = engine.getProperty("voices")
for voice in voices:
    print(voice.id)
Then choose one:
engine.setProperty("voice", voices[0].id)
Save to File
import pyttsx3
engine = pyttsx3.init()
engine.save_to_file(
    "This file was created using Python text to speech.",
    "output.wav"
)
engine.runAndWait()
Method 2: Convert Text-to-Speech with gTTS
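gTTS (Google Text-to-Speech) is a simple and popular library that sends your text to Google Translate's text-to-speech service and saves the result as MP3.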
Install
pip install gTTS
Example
from gtts import gTTS
text = "Welcome to convert text to speech with Python."
tts = gTTS(text=text, lang="en")
tts.save("voice.mp3")
Now you have an MP3 file.
Play the File
Windows:
import os
os.system("start voice.mp3")
Linux:
os.system("xdg-open voice.mp3")
Mac:
os.system("open voice.mp3")
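The three commands above can be folded into one helper that picks the right one for the current system. This is a minimal sketch; the function names `opener_command` and `play` are made up for illustration, and it uses `os.startfile` on Windows because `start` is a shell built-in rather than a program.

```python
import os
import platform
import subprocess


def opener_command(path, system=None):
    """Return the command that opens `path` in the default player.

    Returns None on Windows, where os.startfile is used instead.
    """
    system = system or platform.system()
    if system == "Darwin":  # macOS
        return ["open", path]
    if system == "Windows":
        return None
    return ["xdg-open", path]  # assumed default for Linux desktops


def play(path):
    """Open an audio file with whatever player the system associates with it."""
    command = opener_command(path)
    if command is None:
        os.startfile(path)  # this function only exists on Windows
    else:
        subprocess.run(command, check=False)
```

Calling `play("voice.mp3")` then works the same way on all three platforms, as long as a default audio player is configured.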
Multi-language Support
from gtts import gTTS
tts = gTTS("مرحبا بك في مشروع بايثون", lang="ar")
tts.save("arabic.mp3")
Supported examples:
English: en
Arabic: ar
French: fr
Spanish: es
Method 3: Better Voices with edge-tts
edge-tts uses Microsoft Edge's online text-to-speech service, and it is one of the best free options.
Install
pip install edge-tts
Example
import asyncio
import edge_tts
async def main():
    communicate = edge_tts.Communicate(
        "Welcome to modern text to speech with Python.",
        voice="en-US-AriaNeural"
    )
    await communicate.save("modern.mp3")
asyncio.run(main())
The voice quality is excellent.
Popular Voices
en-US-AriaNeural
en-US-GuyNeural
en-GB-SoniaNeural
fr-FR-DeniseNeural
ar-SA-ZariyahNeural
Convert Text File to Speech
Many users want to convert .txt documents.
Example
from gtts import gTTS
with open("story.txt", "r", encoding="utf-8") as file:
    text = file.read()
tts = gTTS(text=text, lang="en")
tts.save("story.mp3")
This turns a text file into spoken audio.
Convert PDF to Speech with Python
Install PDF reader:
pip install PyPDF2
Example
import PyPDF2
from gtts import gTTS
text = ""
with open("book.pdf", "rb") as file:
    reader = PyPDF2.PdfReader(file)
    for page in reader.pages:
        # Pages without a text layer (for example scans) may yield nothing
        text += page.extract_text() or ""
tts = gTTS(text=text[:5000], lang="en")
tts.save("book.mp3")
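Now the first 5,000 characters of your PDF become audio. To convert the whole document, split the text into chunks as covered under Convert Long Text Properly.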
Convert Word Document to Speech
Install:
pip install python-docx
Example
from docx import Document
from gtts import gTTS
doc = Document("notes.docx")
text = "\n".join([p.text for p in doc.paragraphs])
tts = gTTS(text=text, lang="en")
tts.save("notes.mp3")
Build a Command-Line TTS Tool
from gtts import gTTS
text = input("Enter text: ")
tts = gTTS(text=text, lang="en")
tts.save("result.mp3")
print("Audio created.")
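A command-line tool is usually easier to automate when it takes arguments instead of prompting. The sketch below shows one way to do that with `argparse`; the option names (`--output`, `--lang`) and the `main` function are illustrative choices, not part of gTTS.

```python
import argparse


def build_parser():
    """Describe the command-line options (all names here are illustrative)."""
    parser = argparse.ArgumentParser(description="Convert text to an MP3 file.")
    parser.add_argument("text", help="text to speak")
    parser.add_argument("-o", "--output", default="result.mp3", help="output MP3 path")
    parser.add_argument("--lang", default="en", help="language code, e.g. en, ar, fr")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported here so that --help still works when gTTS is not installed.
    from gtts import gTTS

    gTTS(text=args.text, lang=args.lang).save(args.output)
    print(f"Audio created: {args.output}")
    return 0
```

To use it as a script, call `main()` at the bottom of the file and run something like `python tts_cli.py "Hello there" -o hello.mp3 --lang en` (the file name is only an example).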
Build a Flask API for Text-to-Speech
Install:
pip install flask gtts
API Example
from flask import Flask, request, send_file
from gtts import gTTS
app = Flask(__name__)
@app.route("/tts", methods=["POST"])
def tts():
    text = request.json["text"]
    speech = gTTS(text=text, lang="en")
    speech.save("output.mp3")
    return send_file("output.mp3")
# Start the development server only when this file is run directly
if __name__ == "__main__":
    app.run(debug=True)
Request:
{
    "text": "Hello from Flask API"
}
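To try the endpoint from Python, the request has to be a POST with a JSON body and a JSON content type, because the handler reads `request.json`. Here is a minimal client sketch using only the standard library. It assumes the Flask development server is running on its default address, `http://127.0.0.1:5000`, and the names `build_tts_request` and `fetch_speech` are hypothetical.

```python
import json
import urllib.request


def build_tts_request(text, url="http://127.0.0.1:5000/tts"):
    """Build the POST request; the URL assumes Flask's default dev server."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_speech(text, out_path="reply.mp3"):
    """Send the request and write the returned MP3 bytes to out_path."""
    with urllib.request.urlopen(build_tts_request(text)) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

With the server running, `fetch_speech("Hello from Flask API")` should leave a `reply.mp3` file next to the script.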
Batch Convert Many Files
import os
from gtts import gTTS
folder = "texts"
for filename in os.listdir(folder):
    if filename.endswith(".txt"):
        with open(os.path.join(folder, filename), "r", encoding="utf-8") as f:
            text = f.read()
        out = filename.replace(".txt", ".mp3")
        gTTS(text=text).save(out)
print("Done.")
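The loop above writes each MP3 into the current working directory. If you would rather keep results in a separate folder, `pathlib` makes the name mapping explicit. This is a sketch under assumptions: the `output` folder name and both helper names are made up for illustration.

```python
from pathlib import Path


def output_path(txt_path, out_dir="output"):
    """Map texts/intro.txt to output/intro.mp3 (only the last suffix changes)."""
    return Path(out_dir) / Path(txt_path).with_suffix(".mp3").name


def plan_batch(folder="texts", out_dir="output"):
    """Return (source, target) pairs for every .txt file, in a stable order."""
    sources = sorted(Path(folder).glob("*.txt"))
    return [(source, output_path(source, out_dir)) for source in sources]
```

Each pair from `plan_batch()` can then be fed to gTTS: read the source with `source.read_text(encoding="utf-8")` and save to `str(target)` after creating the output folder.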
Add Human Emotion Through Punctuation
Good text sounds better when written naturally.
Instead of:
Hello welcome today we learn python
Use:
Hello! Welcome. Today, we learn Python.
Voices pause more naturally.
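Part of this cleanup can be automated before the text reaches the speech engine. The helper below is a small sketch (the name `tidy_for_speech` is made up) that collapses stray whitespace and makes sure the text ends with sentence punctuation. It cannot guess where missing commas or full stops belong inside the text, so that part still has to be written by hand.

```python
import re


def tidy_for_speech(text):
    """Collapse whitespace and make sure the text ends with sentence punctuation."""
    text = re.sub(r"\s+", " ", text).strip()
    if text and text[-1] not in ".!?":
        text += "."
    return text
```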
Create Podcast Style Narration
script = """
Welcome back to our weekly tech update.
Today we explore Python automation.
Let's begin.
"""
Then convert to speech.
This makes blog-to-audio content easy.
Convert Long Text Properly
Some services limit text length. Split into chunks.
def split_text(text, size=3000):
    return [text[i:i+size] for i in range(0, len(text), size)]
Then process chunk by chunk.
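A fixed-width split like the one above can cut a word or sentence in half, which tends to produce an audible glitch at the boundary. One alternative, sketched here as an assumption rather than a required approach, is to group whole sentences into chunks and fall back to a hard cut only when a single sentence is longer than the limit. The function name and the simple punctuation-based sentence rule are illustrative.

```python
import re


def split_on_sentences(text, size=3000):
    """Group whole sentences into chunks of at most `size` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than `size` is cut at the limit.
        while len(sentence) > size:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:size])
            sentence = sentence[size:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= size:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can be passed to the speech library on its own and saved as a numbered file such as `1.mp3`, `2.mp3`, and so on.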
Merge Audio Files Later
Use pydub. It needs ffmpeg (or libav) installed on the system to read and write MP3 files.
pip install pydub
from pydub import AudioSegment
a = AudioSegment.from_mp3("1.mp3")
b = AudioSegment.from_mp3("2.mp3")
final = a + b
final.export("full.mp3", format="mp3")
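With more than two chunks, the files need to be joined in numeric order; a plain alphabetical sort would put `10.mp3` before `2.mp3`. The sketch below sorts by the number in each file name and then concatenates with pydub. The helper names are illustrative, and it assumes chunk files are named with a number, as in the example above.

```python
import re
from pathlib import Path


def chunk_number(name):
    """Return the first integer in the file name (ignoring the extension), or 0."""
    match = re.search(r"\d+", Path(name).stem)
    return int(match.group()) if match else 0


def in_playback_order(names):
    """Sort chunk files numerically, so 2.mp3 comes before 10.mp3."""
    return sorted(names, key=chunk_number)


def merge_mp3(names, out="full.mp3"):
    """Concatenate the chunk files in numeric order and export one MP3."""
    # Imported here so the ordering helpers can be used without pydub installed.
    from pydub import AudioSegment

    combined = AudioSegment.empty()
    for name in in_playback_order(names):
        combined += AudioSegment.from_mp3(name)
    combined.export(out, format="mp3")
    return out
```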
Add Background Music
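from pydub import AudioSegment
voice = AudioSegment.from_mp3("voice.mp3")
# Lower the music by 20 dB so it sits under the voice
music = AudioSegment.from_mp3("music.mp3") - 20
# The result is as long as the music, so the music track should be at least as long as the voice
mixed = music.overlay(voice)
mixed.export("podcast.mp3", format="mp3")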
Convert Arabic Text to Speech
from gtts import gTTS
text = "مرحبا بك في مشروع تحويل النص إلى كلام"
tts = gTTS(text=text, lang="ar")
tts.save("arabic.mp3")
Useful for Moroccan, Arabic, and multilingual tools.
GUI App with Tkinter
import tkinter as tk
from gtts import gTTS
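def convert():
    # "end-1c" drops the trailing newline that the Text widget always appends
    text = box.get("1.0", "end-1c").strip()
    if text:  # gTTS raises an error when given empty text
        gTTS(text=text).save("gui.mp3")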
app = tk.Tk()
box = tk.Text(app, height=10, width=50)
box.pack()
btn = tk.Button(app, text="Convert", command=convert)
btn.pack()
app.mainloop()
Common Problems
Voice Sounds Robotic
Use:
edge-tts
Azure voices
Google Cloud voices
Arabic Characters Broken
Use UTF-8:
open("file.txt", "r", encoding="utf-8")
Large File Fails
Split into chunks.
No Sound with pyttsx3
Check system audio drivers.
Best Tool by Use Case
Need | Best Choice |
|---|---|
Offline local tool | pyttsx3 |
Fast MP3 export | gTTS |
Natural free voices | edge-tts |
Commercial quality | Azure / Google |
Open-source AI | Coqui TTS |
Real Project Ideas
1. Blog to Audio Website
Convert articles to MP3 automatically.
2. Reading Assistant
Paste text and listen instantly.
3. PDF Audiobook Generator
Convert books to chapters.
4. Language Practice App
Hear pronunciation.
5. Accessibility Reader
Read websites aloud.
Folder Structure Example
tts_project/
├── app.py
├── input/
├── output/
├── templates/
├── static/
└── requirements.txt
requirements.txt
flask
gtts
edge-tts
pyttsx3
pydub
python-docx
PyPDF2
Performance Tips
Cache generated files
Reuse repeated speech
Use async for many requests
Compress large MP3 files
Queue batch jobs
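Caching and reuse can be as simple as naming each audio file after a hash of its text and voice, then skipping synthesis when that file already exists. The following is a sketch under assumptions: the `tts_cache` folder name is arbitrary, and `synthesize(text, voice, path)` is a hypothetical callable that you would implement with whichever library you chose.

```python
import hashlib
from pathlib import Path


def cache_path(text, voice, cache_dir="tts_cache"):
    """Derive a stable file name from the voice and the exact text."""
    digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    return Path(cache_dir) / f"{digest}.mp3"


def speak_cached(text, voice, synthesize, cache_dir="tts_cache"):
    """Return the cached MP3 for (text, voice), synthesizing only on a miss.

    `synthesize(text, voice, path)` is a hypothetical callable that must
    write the audio to `path`.
    """
    path = cache_path(text, voice, cache_dir)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        synthesize(text, voice, path)
    return path
```

Because the key includes the voice, the same sentence spoken by two different voices is stored as two separate files.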
Security Tips for APIs
If users send text:
Limit max length
Clean dangerous input
Add rate limits
Use temp folders
Delete old files
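Several of these tips fit in a few lines of standard-library code. The sketch below validates incoming text and creates a unique temporary output file per request, so concurrent users never overwrite a shared `output.mp3`. The 2000-character limit and both function names are assumptions chosen for illustration; rate limiting is left out because it depends on the web framework.

```python
import os
import re
import tempfile

MAX_CHARS = 2000  # assumed limit; tune it to your service

# Control characters other than tab, newline, and carriage return
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_text(raw, max_chars=MAX_CHARS):
    """Reject missing, empty, or oversized input and strip control characters."""
    if not isinstance(raw, str):
        raise ValueError("text must be a string")
    text = _CONTROL_CHARS.sub("", raw).strip()
    if not text:
        raise ValueError("text is empty")
    if len(text) > max_chars:
        raise ValueError(f"text exceeds {max_chars} characters")
    return text


def new_output_path(directory=None):
    """Create a unique, empty .mp3 file in a temp folder and return its path."""
    fd, path = tempfile.mkstemp(suffix=".mp3", dir=directory)
    os.close(fd)
    return path
```

After the response has been sent, the file can be deleted with `os.remove(path)`, or a scheduled job can clear out anything older than a chosen age.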
Human Advice from Experience
Many developers start TTS projects thinking it's just "convert text and done." Then they discover the real magic is quality:
natural pauses
voice selection
sentence formatting
chunking long text
multilingual support
speed control
That is what separates a toy project from something users love.
Full Beginner Script
from gtts import gTTS
import os
text = input("Enter text: ")
tts = gTTS(text=text, lang="en")
tts.save("speech.mp3")
os.system("start speech.mp3")  # Windows only; use "open" on macOS or "xdg-open" on Linux
Final Thoughts
Converting text to speech with Python is one of the most rewarding beginner-to-advanced projects you can build. It starts with a few lines of code, but quickly opens doors to accessibility apps, voice assistants, educational tools, content automation, and modern user experiences.
Python gives you the freedom to start simple with gTTS, go offline with pyttsx3, or achieve high-quality voices using edge-tts and cloud APIs.
Sometimes the most powerful projects are the ones that literally give your software a voice.