Microsoft Open Sources the MarkItDown Project to Convert PDFs, Office Documents, Images, and Audio/Video to Markdown Format

Many developers prefer to write in Markdown, and Microsoft has now open-sourced a project called MarkItDown that converts a wide range of content into Markdown, with AI assisting for some formats.

For instance, it can convert from the following formats:

PDF
PowerPoint / PPTX
Excel / XLSX
Word / DOCX
Images / EXIF metadata and OCR
Audio / EXIF metadata and speech transcription
HTML / Special handling for Wikipedia and others
Other text-based formats like CSV, JSON, XML, etc.

For formats such as images and audio, which cannot be converted to text directly, the tool reads EXIF metadata and applies OCR to images, and uses speech transcription to turn audio into text.

So what is the project for? Essentially, it helps developers convert large numbers of files in different formats into Markdown, which makes later indexing and text analysis easier.

The project is open-sourced under the MIT license. Interested developers can find it here: https://github.com/microsoft/markitdown

A brief usage guide follows:

You can install using pip: pip install markitdown

Or install from source, from the root of a cloned copy of the repository: pip install -e .

The API usage is also very straightforward:

from markitdown import MarkItDown

markitdown = MarkItDown()
result = markitdown.convert("test.xlsx")
print(result.text_content)
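
Because the stated goal is to prepare many files for indexing, the single-file call above can be wrapped in a loop. What follows is a minimal sketch under stated assumptions, not something shipped with the project: the function name convert_tree, the suffix list, and the output layout are all hypothetical. Only the convert(...) call and the text_content attribute come from the example above.

```python
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical suffix list for illustration; adjust it to the formats you need.
DEFAULT_SUFFIXES = (".pdf", ".pptx", ".xlsx", ".docx", ".html", ".csv", ".json", ".xml")


def convert_tree(src_dir, out_dir, convert: Callable,
                 suffixes: Iterable[str] = DEFAULT_SUFFIXES):
    """Convert every matching file under src_dir and write .md files to out_dir.

    `convert` is any callable that takes a path string and returns an object
    with a `text_content` attribute, for example `MarkItDown().convert`.
    Returns the list of Markdown files written.
    """
    src, out = Path(src_dir), Path(out_dir)
    wanted = {s.lower() for s in suffixes}
    written = []
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        result = convert(str(path))
        # Mirror the source layout and keep the original suffix in the name,
        # so report.pdf and report.docx do not overwrite each other.
        target = out / path.relative_to(src).with_name(path.name + ".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(result.text_content, encoding="utf-8")
        written.append(target)
    return written
```

With the library installed, a call such as convert_tree("docs", "markdown_out", MarkItDown().convert) would write one .md file per matching input. Passing the converter in as an argument keeps the loop independent of the library, which also makes it easy to try out with a stand-in function.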

It can also use a large language model to describe images, in which case you need to pass in a model client and the model name:

from markitdown import MarkItDown
from openai import OpenAI

client = OpenAI()
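# Newer releases name these parameters llm_client / llm_model;
# early releases used mlm_client / mlm_model instead.
md = MarkItDown(llm_client=client, llm_model="gpt-4o")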
result = md.convert("example.jpg")
print(result.text_content)

Copyright Notice:
Thank you for reading. This article was written by Landian News, and the author is Brook.X. If you wish to repost this article, please include a link to the original: https://landian.news/article/4540.html
