Tech News

faster-whisper: Extracting Text from Multimedia Speech Material - 3

"""
Batch-transcribe the .mp3 files in the current directory using faster-whisper.
"""

import logging
import sys
from pathlib import Path
from faster_whisper import WhisperModel

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger('Extract Text')

# ================== Configuration ==================
MODEL_SIZE = "small"          # options: tiny, base, small, medium, large
DEVICE = "cpu"                # cpu or cuda
COMPUTE_TYPE = "int8"         # int8, float16, float32 (int8 recommended on CPU)
VAD_FILTER = True             # enable voice activity detection to drop silence
OUTPUT_FORMAT = "txt"         # only .txt output is produced
VERBOSE = True                # whether to show progress output
# ===================================================

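def transcribe_audio(audio_path: Path, model: WhisperModel) -> str:
    """Transcribe a single audio file and return its text."""
    print(f"Transcribing: {audio_path.name} → {audio_path.stem}.txt")

    segments, info = model.transcribe(
        str(audio_path),
        language=None,           # auto-detect the language
        beam_size=5,
        vad_filter=VAD_FILTER,
        vad_parameters=dict(min_silence_duration_ms=500),
        word_timestamps=False,
    )

    text_lines = []
    # segments is a lazy generator; transcription runs as it is iterated
    for segment in segments:
        line = segment.text.strip()
        text_lines.append(line)
        if VERBOSE:
            # logger.info() does not accept end/flush, so print the progress dot directly
            print(".", end="", flush=True)
            # print(f"[{segment.start:06.2f}s --> {segment.end:06.2f}s] {line}", flush=True)

    if VERBOSE:
        print()  # end the progress-dot line

    # Join segments with real newlines (the original joined on the letter "n")
    return "\n".join(text_lines)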


def main():
    print("=== faster-whisper batch transcription ===")

    current_dir = Path(".")
    mp3_files = sorted(current_dir.glob("*.mp3"))

    if not mp3_files:
        print("No .mp3 files found; exiting.")
        return

    # Load the model once and reuse it for every file
    print(f"Loading model {MODEL_SIZE} ({DEVICE}, {COMPUTE_TYPE})...")
    model = WhisperModel(MODEL_SIZE, device=DEVICE, compute_type=COMPUTE_TYPE)

    processed = 0
    for mp3_path in mp3_files:
        txt_path = mp3_path.with_suffix(".txt")

        # Skip files that already have a transcript, so reruns resume where they stopped
        if txt_path.exists():
            print(f"Skipping: {txt_path.name} already exists")
            continue

        try:
            text = transcribe_audio(mp3_path, model)
            txt_path.write_text(text, encoding="utf-8")
            processed += 1
        except Exception as e:
            print(f"Error transcribing {mp3_path.name}: {e}", file=sys.stderr)

    print(f"All done! Processed {processed} file(s).")


if __name__ == "__main__":
    main()
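
The commented-out print inside transcribe_audio hints at a timestamped variant of the output. A minimal sketch of that idea follows, assuming only that each segment exposes start, end, and text attributes, as the script already uses. FakeSegment, format_timestamp, and segments_to_timestamped_text are hypothetical names introduced for illustration; they are not part of faster-whisper or of the original script.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeSegment:
    """Hypothetical stand-in with the start, end, and text attributes the script reads."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_timestamped_text(segments: Iterable) -> str:
    """Build one '[start --> end] text' line per non-empty segment."""
    lines = []
    for segment in segments:
        text = segment.text.strip()
        if not text:
            continue  # skip segments that contain only whitespace
        start = format_timestamp(segment.start)
        end = format_timestamp(segment.end)
        lines.append(f"[{start} --> {end}] {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Demo data only; a real run would pass the segments returned by model.transcribe()
    demo = [
        FakeSegment(0.0, 2.5, " Hello there. "),
        FakeSegment(2.5, 61.5, "Second line."),
    ]
    print(segments_to_timestamped_text(demo))
```

To try it in the script, transcribe_audio could return segments_to_timestamped_text(segments) instead of joining the bare text lines; whether timestamps belong in the .txt output is a design choice the original leaves open.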
