Coding Across Languages: The Multilingual Programming Paradigm

This article explores multilingual programming and the challenges it poses. It argues that multilingual support must go beyond translating messages and bring natural-language elements into the coding environment itself. It advocates Unicode support and flexible syntax, and discusses the importance of considering different cultures. It also highlights native language support and comprehensive documentation as crucial components of an inclusive and enriching multilingual programming experience.

Introduction

There are two distinct interpretations of multilingual programming. In the first, a programmer is considered multilingual because they are proficient in more than one programming language. In the second, the term refers to the ability to code in multiple natural (human) languages. This article focuses on the second interpretation.

Multilingual programming means providing developers with languages and tools that support development, documentation, testing, and maintenance in multiple human languages. A truly multilingual programming language must be universal: not tied to any specific human language, and capable of supporting any human language, including constructed ones.

Whether it is more valuable to know many programming languages or to have a strong grasp of programming fundamentals and paradigms is a debate worth exploring. Either way, familiarity with the programming language used in a project makes it much quicker to get started, because its syntax is already understood. The same argument extends to human languages: ideally, a programmer should have a development environment that lets them code in their native language, without the mental gymnastics of translating function names and their purpose.

Multilingual

Many programming languages predominantly target English speakers [1,2,3,4], which can pose significant challenges for non-English speakers. Programming keywords are usually written in the Latin alphabet and are frequently derived from English words. For instance, defining a function in a language like Python requires the keyword def, which is derived from "define." Similarly, the keyword for defining a class is the English word "class." Now, imagine a non-English speaker embarking on their programming journey. Unlike mathematics, whose notation is largely independent of any one language, programming requires them to learn these English or English-like keywords, which can be particularly daunting for those whose native language doesn't use the Latin alphabet. Some may argue that many developers have at least a basic understanding of English, which helps them recall command abbreviations like ls for listing files or mkdir for creating directories. However, for beginners and for those with little English, the language barrier remains a significant challenge.
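As a rough illustration of how such keywords could be localized, the sketch below rewrites a few Spanish keyword spellings into Python's own before running the code. The mapping KEYWORDS_ES, the sample SOURCE_ES, and the function translate_keywords are hypothetical names invented for this example; only Python's standard tokenize module is used. It is a toy under those assumptions, not a description of how any existing localized language works.

```python
import io
import tokenize

# Hypothetical Spanish spellings for a few Python keywords (illustrative only).
KEYWORDS_ES = {
    "definir": "def",
    "clase": "class",
    "retornar": "return",
}

# A small program written with the localized keywords.
SOURCE_ES = """\
definir doble(x):
    retornar x * 2
"""

def translate_keywords(source, mapping):
    """Replace localized keyword spellings with the real Python keywords."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only NAME tokens are touched, so strings and comments stay as written.
        if tok.type == tokenize.NAME and tok.string in mapping:
            tok = tok._replace(string=mapping[tok.string])
        tokens.append(tok)
    return tokenize.untokenize(tokens)

namespace = {}
exec(translate_keywords(SOURCE_ES, KEYWORDS_ES), namespace)
print(namespace["doble"](21))  # 42
```

Because the mapping is applied to every name token, an identifier that happens to match a localized keyword would be rewritten too; a real design would have to resolve that ambiguity in the language's grammar.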

Beyond translation

Currently, multilingual programming support primarily centers around translating input and output messages. While this is a crucial step toward making software more accessible to a global audience, it's essential to broaden our perspective on multilingual programming. It's not just about translation; it's about creating a genuinely inclusive coding environment that transcends language barriers.

Let's consider the bigger picture: development in a programming language entails more than just grasping its syntax and semantics. What about error messages? When a program encounters a problem, these messages give crucial feedback to both developers and users, so they must be presented in a language their readers can understand. The same holds for logging, which plays a pivotal role in troubleshooting and debugging: log entries that can be produced in the maintainer's language make error diagnosis more effective. Furthermore, the documentation of functions and classes in the codebase should be accessible across languages, which improves code readability and collaboration. In essence, multilingual programming should encompass not only message translation but also a thoughtful approach to error messages, logging, and documentation.
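One way to picture multilingual log entries is a filter that swaps each log message for its translation before it is formatted. The sketch below uses Python's standard logging module; the catalog LOG_CATALOG, the class TranslatingFilter, and the helper make_logger are hypothetical names made up for this illustration, and the single French entry is sample data.

```python
import io
import logging

# Hypothetical catalog keyed by the English source message (sample data).
LOG_CATALOG = {
    "fr": {"Disk usage at %d%%": "Utilisation du disque à %d%%"},
}

class TranslatingFilter(logging.Filter):
    """Replace a record's message template with its translation, if one exists."""

    def __init__(self, language):
        super().__init__()
        self.language = language

    def filter(self, record):
        catalog = LOG_CATALOG.get(self.language, {})
        if isinstance(record.msg, str):
            # Unknown messages fall back to the original text.
            record.msg = catalog.get(record.msg, record.msg)
        return True  # never drop the record, only rewrite it

def make_logger(language, stream):
    """Build a logger whose handler writes translated messages to `stream`."""
    logger = logging.getLogger(f"demo.{language}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    handler.addFilter(TranslatingFilter(language))
    logger.addHandler(handler)
    return logger

buffer = io.StringIO()
make_logger("fr", buffer).warning("Disk usage at %d%%", 91)
print(buffer.getvalue(), end="")  # WARNING Utilisation du disque à 91%
```

Note that the level name is still the English word WARNING: the message is translated, but the vocabulary of the logging framework itself is not.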

Multilingual programming must also take into account user interfaces, configuration in XML, JSON, or other formats, command-line options, and units of measurement, among other aspects. Most systems provide support for multiple locales. For instance, on Linux systems the locale command reports the current locale settings, and locale -a lists the locales available on a given machine, which aids software localization. However, our current translation approach appears somewhat outdated and cumbersome. It gives the impression of having been developed at a time when most programmers were English speakers.

The commonly used translation methodology relies on gettext [5] and .po files. A tool scans the source code, extracts the user-facing messages, and saves them in .po files, which are then translated into different languages. When a user starts an application, the translated messages matching the user's locale are loaded and shown in place of the originals. But is there another way to envision this process? What approach could we take if a programming language were inherently multilingual? Would it involve more than mere translation? Could it entail the use of numerical identifiers?
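The two lookup styles raised here can be sketched side by side. The first part imitates gettext-style lookup, where the English source string is the key, using Python's standard gettext module with an in-memory dictionary standing in for compiled catalog files. The second part looks messages up by numeric identifier. All names (CATALOGS, DictTranslations, translator_for, Msg, MESSAGES_BY_ID, message) and the sample translations are assumptions made for this illustration.

```python
import gettext
from enum import IntEnum

# Hypothetical in-memory catalogs standing in for translated .po files.
CATALOGS = {
    "es": {"File not found": "Archivo no encontrado"},
    "de": {"File not found": "Datei nicht gefunden"},
}

class DictTranslations(gettext.NullTranslations):
    """gettext-style translator backed by a plain dictionary."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # As in gettext, an untranslated message is returned unchanged.
        return self._messages.get(message, message)

def translator_for(language):
    return DictTranslations(CATALOGS.get(language, {}))

_ = translator_for("es").gettext
print(_("File not found"))  # Archivo no encontrado

# Alternative: messages keyed by a numeric identifier instead of English text.
class Msg(IntEnum):
    FILE_NOT_FOUND = 1001

MESSAGES_BY_ID = {
    Msg.FILE_NOT_FOUND: {
        "en": "File not found",
        "es": "Archivo no encontrado",
    },
}

def message(msg_id, language, default_language="en"):
    """Look up a message by identifier, falling back to the default language."""
    texts = MESSAGES_BY_ID[msg_id]
    return texts.get(language, texts[default_language])

print(message(Msg.FILE_NOT_FOUND, "es"))  # Archivo no encontrado
```

In the second style no human language is privileged as the lookup key, although the identifier names in the source code are still written in some language.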

Unicode

To achieve true multilingual programming, robust Unicode support [6,7] and flexible language syntax [8,9,10] are essential. The syntax of a programming language should possess the necessary flexibility to accommodate diverse language structures and grammar rules. It must also be capable of handling various character encodings, seamlessly integrating non-ASCII characters within identifiers, comments, and string literals.

Comprehensive multilingual programming requires a solid grasp of localization and internationalization. These processes go beyond mere message translation; they encompass adapting software to different cultural contexts. This includes considerations like date and time formats, numeric representations, and text directionality (left-to-right or right-to-left writing systems). To ensure seamless global usability, standardized translations and robust localization processes are indispensable.
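The considerations above can be made concrete with a small sketch. The convention table below is hand-written and deliberately simplified, and the names CONVENTIONS, format_number, format_date, and is_right_to_left are hypothetical; a real application would rely on maintained locale data rather than entries like these. Directionality is detected with the Unicode bidirectional class from Python's standard unicodedata module.

```python
import unicodedata
from datetime import date

# Hypothetical, simplified convention table (illustration only).
CONVENTIONS = {
    "en_US": {"thousands": ",", "decimal": ".", "date": "{m:02d}/{d:02d}/{y}"},
    "de_DE": {"thousands": ".", "decimal": ",", "date": "{d:02d}.{m:02d}.{y}"},
}

def format_number(value, locale_code):
    """Format a number with the locale's thousands and decimal separators."""
    conv = CONVENTIONS[locale_code]
    text = f"{value:,.2f}"  # always "1,234,567.89" style at this point
    # Swap separators through a placeholder so the two replacements don't clash.
    return (
        text.replace(",", "\0")
        .replace(".", conv["decimal"])
        .replace("\0", conv["thousands"])
    )

def format_date(day, locale_code):
    """Format a date in the locale's day/month order."""
    pattern = CONVENTIONS[locale_code]["date"]
    return pattern.format(d=day.day, m=day.month, y=day.year)

def is_right_to_left(text):
    """Guess direction from the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # Hebrew-type or Arabic-type letter
            return True
        if bidi == "L":  # left-to-right letter
            return False
    return False  # no strong character found; default to left-to-right

print(format_number(1234567.89, "en_US"))      # 1,234,567.89
print(format_number(1234567.89, "de_DE"))      # 1.234.567,89
print(format_date(date(2024, 3, 5), "en_US"))  # 03/05/2024
print(format_date(date(2024, 3, 5), "de_DE"))  # 05.03.2024
print(is_right_to_left("שלום"))                # True
print(is_right_to_left("hello"))               # False
```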

Another aspect worthy of attention is the use of Unicode in variable and function names. Allowing Unicode in these identifiers lets developers use characters from diverse scripts and languages, making code clearer and more accessible to a broad developer community. It's worth noting, however, that most programming language specifications do not support multilingual keywords: even where identifiers can be written in any script, the keywords usually remain fixed.
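Python is one language that already behaves this way, and it makes a convenient illustration. In the short sketch below, the function and parameter names are Russian words chosen for this example (roughly "circle_area" and "radius"), and "определить" is simply the Russian verb "to define", used to show that it is not accepted as a keyword.

```python
import keyword
import math

def площадь_круга(радиус):
    """Return the area of a circle; both names are written in Cyrillic."""
    return math.pi * радиус ** 2

print(площадь_круга(1.0) == math.pi)     # True

# Identifiers may use letters from many scripts...
print("площадь_круга".isidentifier())    # True
print("2fast".isidentifier())            # False (cannot start with a digit)

# ...but the keywords themselves stay English.
print(keyword.iskeyword("def"))          # True
print(keyword.iskeyword("определить"))   # False
```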

Development environments

To facilitate multilingual programming, it's imperative to incorporate native language support within development environments. This effort extends beyond mere translation; it's about enhancing the coding experience [11], making it more inclusive and efficient. Development environments should offer features such as syntax highlighting, autocompletion, and error checking that work in multiple human languages. These capabilities not only help developers write accurate code but also create an environment where non-English-speaking programmers can code comfortably in their native languages, fostering a sense of belonging.

Additionally, robust documentation and community support are fundamental pillars of multilingual programming. Clear and accessible documentation available in various languages facilitates learning and troubleshooting, while a diverse and inclusive community ensures that programmers from all corners of the world can seek assistance, share knowledge, and collaborate effectively.

Conclusion

Multilingual programming goes beyond mere translation; it demands a holistic approach that encompasses diverse language elements, cultural subtleties, and the creation of inclusive coding environments. Better tooling, stronger Unicode support, and flexible syntax, together with native language support and comprehensive documentation, pave the way for a genuinely inclusive and efficient multilingual programming experience. By addressing these aspects, we can bridge linguistic divides and empower developers worldwide to collaborate in their mother tongues.

References

  1. Coding Is for Everyone—as Long as You Speak English
  2. The next programming language is English
  3. Introducing English as the New Programming Language for Apache Spark
  4. Tech’s hottest new job: AI whisperer. No coding required.
  5. gettext
  6. Unicode
  7. The transition to multilingual programming
  8. Language-independent specification
  9. Non-English-based programming languages
  10. Natural Language Programming
  11. van der Storm, Tijs, and Jurgen J. Vinju. “Towards Multilingual Programming Environments.” Science of Computer Programming, vol. 97, Jan. 2015, pp. 143–49. ScienceDirect, doi:10.1016/j.scico.2013.11.041