Text processing

Automatic transcriptions

Automatic transcription workflows.

Cuéllar, Álvaro (2023). «La Inteligencia Artificial al rescate del Siglo de Oro. Transcripción y modernización automática de mil trescientos impresos y manuscritos teatrales». Hipogrifo. Revista de literatura y cultura del Siglo de Oro, vol. 11, núm. 1, pp. 101-115.

We have recently developed automatic transcription workflows using Transkribus. These workflows have allowed us to automatically transcribe and modernize the spelling of around 1,000 printed books and 350 manuscripts of Golden Age theatre, which are now part of CETSO and TEXORO.

1,000 printed books automatically transcribed and spelling-modernized

350 manuscripts incorporated into the project workflows

99% approximate accuracy for printed books

90% approximate accuracy for manuscripts

The three models used are public, and anyone can use them through Transkribus.

Spanish Golden Age Prints 1.0 (Transkribus, 2021)

Model trained for automatic transcription of Golden Age theatrical printed books.

Spanish Golden Age Prints (Spelling Modernization) 1.0 (Transkribus, 2021)

Version designed for automatic spelling modernization of already transcribed printed books.

Spanish Golden Age Manuscripts (Spelling Modernization) 1.0 (Transkribus, 2021)

Model focused on theatrical manuscripts, with spelling modernization and detection of relevant features.

These models transcribe theatrical texts with approximately 99% accuracy for printed books and 90% for manuscripts. They can also modernize spelling automatically according to current standards and detect certain features, such as italics.

Example of automatic transcription applied to a Golden Age theatrical text.

Second example of automatic transcription and spelling modernization.

If you would like to learn more about the tool, apply our transcription models to your documents, or request a specific transcription of a printed book or manuscript for research, please contact Álvaro Cuéllar.