Leveraging the Power of Modularized Programming
For Use in Agent-Based Modeling
Michael Ball, B.S.
Complexity in Health Group
Kent State University at Ashtabula
In this paper, I will first give a brief summary of the evolution of computer programming languages, from their beginnings, when programming consisted of single lines of machine code entered manually via punch cards, to the present-day use of advanced graphical user interface object-based programming utilities and languages. I will then examine the advantages of using discrete modules of code to facilitate easier and more rapid development and programming of agent-based models for use in computational modeling. As proof of concept, I will discuss the development tracks of several agent-based models that I have programmed in NetLogo, showing how code modules developed for one model can be used in another model with minimal modifications, decreasing the development time required to create new models.
From Punch Cards to Turtles
As modern computer hardware has evolved over the past six decades, the programming methods and programming languages needed to interact with these devices have evolved as well. Even before the development of modern computers, computational devices required programming of some sort to function. Examples of such programming can be traced as far back as the ancient Greeks. The mysterious Antikythera mechanism, which has been dubbed one of the world’s oldest computers, was recovered from a Roman-era shipwreck off the southern coast of Greece in 1901. The device, which dates back to about 100 B.C., used “mechanical programming” to display the motions of the Moon and stars. “The device employs an elaborate arrangement of more than 30 gears for its calculations. The level of miniaturization and complexity is remarkable, with some parts resembling those used by 18th-century clockmakers” (Ravilious).
Figure 1: Antikythera Mechanism
The Antikythera mechanism is not alone. Other examples of ancient mechanically programmed computing devices include the Planisphere and other mechanical computing devices invented by Abū Rayhān al-Bīrūnī (c. AD 1000); the Equatorium and Universal Latitude-independent Astrolabe by Abū Ishāq Ibrāhīm al-Zarqālī (c. AD 1015); the astronomical analog computers of other medieval Muslim astronomers and engineers; and the Astronomical Clock Tower of Su Song (c. AD 1090) during the Song Dynasty.
Moving forward to the late 1800s, we see a more recognizable method of programming being developed: the punch card. “Herman Hollerith is widely regarded as the father of modern automatic computation. He chose the punched card as the basis for storing and processing information and he built the first punched-card tabulating and sorting machines as well as the first key punch, and he founded the company that was to become IBM. Hollerith’s designs dominated the computing landscape for almost 100 years” (da Cruz). Hollerith also paved the way for future computational devices to perform more than one function by introducing a wiring panel in his 1906 Type I Tabulator, allowing it to do different jobs without having to be rebuilt. This form of programming has stood the test of time. Punched cards are still used and manufactured to this day, and their distinctive dimensions (and 80-column capacity) can still be recognized in forms, records, and programs around the world. The cards are the size of American paper currency in Hollerith’s time, a choice he made because there was already equipment available to handle bills.
By the 1940s, the first electrically powered digital computers had been developed. These new devices had (for the time) an immensely greater computing capacity that pushed the punch card input method to its limits. This prompted the development of high-level programming languages. “The first high-level programming language to be designed for a computer was Plankalkül, developed for the German Z3 by Konrad Zuse between 1943 and 1945” (Rojas). Programming languages as a whole went through several generations of increasing abstraction. Computers from the early 1950s, such as UNIVAC I and IBM 701, used first generation language (1GL) programs, also referred to as machine language programs. 1GL programming was quickly superseded by similarly machine-specific, but mnemonic, second generation languages (2GL) known as assembly languages. By the late 1950s, 2GL programming had evolved to include the use of macro instructions. Third generation programming languages (3GL), such as FORTRAN, LISP, and COBOL, are more abstract and are “portable”, meaning that they are implemented similarly on computers that do not support the same native machine code. Updated versions of these 3GLs and many others not mentioned are still in general use today. O’Reilly Media has compiled an extensive graphic showing the parallel development of programming languages over the last 60 years (see figure below). The full size graphic can be viewed at http://oreilly.com/news/graphics/prog_lang_poster.pdf
Figure 2: History of Programming Languages
During the 1960s and 1970s, the programming industry saw the development of the major language paradigms now in use, many aspects of which were refinements of ideas introduced by the very first third-generation programming languages. Examples of this include:
- APL introduced array programming and influenced functional programming.
- PL/I (NPL) was designed in the early 1960s to incorporate the best ideas from FORTRAN and COBOL.
- SIMULA, developed in the 1960s, was the first language designed to support object-oriented programming.
- SMALLTALK, developed in the mid-1970s, was the first “purely” object-oriented language.
- C was developed between 1969 and 1973 as a system programming language.
- PROLOG, designed in 1972, was the first logic programming language.
- In 1978, ML built a polymorphic type system on top of LISP, pioneering statically typed functional programming languages.
The 1980s were years of relative consolidation. The introduction of C++ combined object-oriented and systems programming. ML and LISP both moved forward as programming community standards. Rather than inventing new paradigms, the majority of work elaborated upon the ideas invented in the previous decade. However, a major development in language design for programming large-scale systems during the 1980s was an increased focus on the use of modules, or large-scale organizational units of code. MODULA-2, ADA, and ML were all early adopters of this new modular programming scheme. Today’s large-system programming languages continue to evolve, in both industry and research. Directions being explored include security (biometrics) and reliability verification (redundant data caching), new kinds of modularity (mixins, delegates, aspects), and database integration such as Microsoft’s LINQ.
The most recent major leap was the development of Graphical User Interface (GUI) programming tools that aid programmers in developing code in many programming languages. Microsoft’s Visual Studio is a prime example of this. Visual Studio provides a centralized GUI that allows the programmer to code in C++, C#, Visual Basic and Java. Other examples include JavaBeans, RUBY, NetLogo, and MATLAB. These advanced tools have made the chore of programming easier, but without a clear-cut set of programming guidelines and a rigorous, almost fanatical, mindset toward documentation, a programmer won’t realize the full potential of the tools.
Clean Coding, Documentation, and the Snippet Library
Before the development of object-based programming languages, programs usually followed a linear-branching model, both when being parsed by a computer and when read by a programmer. A good example would be a program written in BASIC, with its sequential line numbers and branching jumps to subroutines. Even without documentation, a programmer unfamiliar with the program could easily trace the run path. However, the introduction of object-based programming, with its ability to call multiple discrete objects from the main program thread, added a new level of complexity to programming. Because of this, maintaining a clean coding style and documenting your code became very important.
All programming languages have widely accepted code standards and best practices that lend themselves to efficient coding. To go into detail on these various standards and practices would be beyond the scope of this paper. Suffice it to say that following the standards and best practices for the language you are programming in is a good first step toward clean coding. The second step is, of course, consistency. Don’t change your style of parsing, indentation, or object naming partway through a program.
The next most important procedure is documentation. Without properly documenting your code, you can fall victim to the often-quoted Eagleson’s Law, which states, “Any code of your own that you haven’t looked at for six or more months might as well have been written by someone else.” The detail level and overall amount of documentation is a matter of personal taste; however, even a minimal, consistent level of code documentation can go a long way toward keeping Eagleson’s Law at bay. At a minimum, your documentation should cover the following: all variables should have a concise description, each object and/or module should be defined, and all core procedures should be annotated.
To be even more effective, your program’s documentation should include a brief description of the functions the program was written to perform, as well as a brief version history to track changes made to the program as it evolves through various iterations. This documentation can be placed in the header space at the beginning of the code. Any copyright information and/or author details can be placed here as well, or annotated at the end of the code.
By following these two practices, clean coding and documentation, the programmer has a firm base for the development and use of a major time-saving tool that facilitates the re-use of modularized code. This tool is the Snippet Library: a collection of code snippets, objects, classes and modules that the programmer can use in the development of new code, instead of having to re-invent the wheel every time. Admittedly, this is not a new concept. Programmers have been re-using code since the dawn of programming. There are even a multitude of websites containing Snippet Libraries for the vast majority of the programming languages in use today. The site at http://www.snippetlibrary.com/ is a good example. At the personal level, programmers can also create and maintain their own Snippet Library for each of the languages that they program in.
These concepts are useful at both the macro level and the micro level. They help optimize code when used in the creation of large, complex modules in such programming languages as MODULA-2 and ADA, as well as in the creation of smaller discrete class objects in programming languages like JAVA and C++. They can even provide benefits when used in the writing of code for markup languages such as HTML and XML.
Agent-Based Model Development Using Code Modules
During the course of my work with the Complexity in Health Group (CHG), located on the Kent State University at Ashtabula campus (http://cch.ashtabula.kent.edu/), I have been developing agent-based models using NetLogo, a Java-based programming GUI developed at Northwestern University. “NetLogo is a programmable modeling environment for simulating natural and social phenomena. It was authored by Uri Wilensky in 1999 and has been in continuous development ever since at the Center for Connected Learning and Computer-Based Modeling” (Wilensky). NetLogo provides an excellent platform for building modularized program code. Even though NetLogo is a statement-based programming platform that is specifically geared toward the field of computational modeling, it benefits greatly from the concepts of modularization.
The Infectious Disease Model that I have just completed is an excellent example of this. A working copy of the model and a complete listing of the code can be found at www.personal.kent.edu/~mdball/Infectious_Disease_Model.htm. The development of this model was aided by the use of code modules developed in previous models. Before examining the code itself, consider the following block of code, which gives an example of clean coding style and documentation.
;;Infectious Disease Model ver. 1
;;This model simulates the spread of an infectious disease traveling via contact through
;;a randomly moving population. The user can draw walls, buildings, or obstacles in the
;;environment to simulate different environments.
breed [healthy] ;;Different breeds of turtles to show health state.
globals [ ;; Global variables.
turtles-own [ ;; Turtle variables.
to building-draw ;; Use the mouse to draw buildings.
if mouse-down?
[ask patch mouse-xcor mouse-ycor
[ set pcolor grey ]]
end
As you can see, the header of the program code contains a brief description of the program. All similar types of variables are grouped together with appropriate descriptions. And the module at the bottom has its own descriptor, as well.
The module at the bottom of the block is also an example of a re-used module. This module, as you can see from the description, allows the user to draw buildings in the simulation grid when running the model. It was obtained from the code example library developed by Dr. Wilensky that is included with the NetLogo environment. Sometimes, a code snippet can be transplanted to a new program with no modification. This is usually the case when the code snippet involved is a global action or primitive statement that does not make use of specific, unique variables. When a code snippet or module does involve specific or unique variables, modification is necessary. I first developed the following module for a zombie simulation (http://www.personal.kent.edu/~mdball/zombies1_4.htm).
to wander ;; If an agent is not fleeing or chasing, have them wander around aimlessly.
set turn-check random 20
if turn-check > 15 [right-turn]
if turn-check < 5 [left-turn]
if [pcolor] of patch-ahead 1 != black [wall]
ask zombie [ if any? other turtles-here [ convert] ]
end
(Ball)
This module enabled the agents to move randomly around the simulation grid. By changing the variable (zombie) that pointed to the agent type, I was able to use the module in several other models. One of these is the Summit-Sim model we developed for our study of Summit County. In this study, “our goal has been to understand how the 20 communities of Summit County function as a complex system and the impact this county-level system has had on the health of its various communities” (Castellani et al.). The model has undergone several revisions and updates, which have been fully documented. This process has enabled me to use modules from this model to go back and optimize other models that were developed previously. The complete code and a working copy of this model can be found at www.personal.kent.edu/~mdball/pareto_schelling_mobility.htm.
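The reuse described here amounts to replacing a hard-coded agent type with a parameter. The following is a minimal sketch of that idea, written in Python rather than NetLogo so that it is self-contained. The names `Agent`, `wander`, `spread_kind`, and `acting_kind`, along with the 15-degree turn angle, are assumptions made for illustration and do not come from the models themselves. Only the `random 20` draw and the `> 15` / `< 5` thresholds mirror the wander module.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    kind: str         # agent type label, e.g. "zombie" or "sick" (hypothetical)
    heading: int = 0  # direction of travel in degrees


def wander(agent, others_here, acting_kind, on_contact, rng=random):
    """Randomly adjust heading, then let agents of acting_kind act on co-located agents."""
    turn_check = rng.randrange(20)   # integer from 0 to 19, like `random 20`
    if turn_check > 15:
        agent.heading = (agent.heading + 15) % 360   # assumed right turn
    elif turn_check < 5:
        agent.heading = (agent.heading - 15) % 360   # assumed left turn
    # The agent type is a parameter, so the same module serves several models.
    if agent.kind == acting_kind and others_here:
        on_contact(agent, others_here)


def spread_kind(agent, others_here):
    """Hypothetical contact action: co-located agents take on the actor's type."""
    for other in others_here:
        other.kind = agent.kind


# The same module used with two different agent types.
zombie, human = Agent("zombie"), Agent("human")
wander(zombie, [human], acting_kind="zombie", on_contact=spread_kind)

sick, healthy = Agent("sick"), Agent("healthy")
wander(sick, [healthy], acting_kind="sick", on_contact=spread_kind)
```

Passing the agent type and the contact action as arguments is one possible design. In NetLogo itself, the equivalent change is the one described above: editing the breed name inside the module.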
The goals of the Complexity in Health Group are to “promote the application of complexity science to the study of health and health care through a cross-disciplinary program of teaching, training and research. The CHG’s application of complexity science includes complex systems thinking, computational modeling, network analysis, data mining, and qualitative and historical approaches to complexity. The CHG is specifically committed to collaborating with health care centers and practitioners in Ashtabula County, Ohio; and to students and faculty at Kent State University” (CHG).
In order to assist these collaborators in developing agent-based models of their own, I am also in the process of finalizing the development of a formalized Snippet Library of NetLogo modules that will be available on the CHG website in a searchable database format. The table below shows the current data fields and several examples of module data.
Models Used In
|Environment||10/10/2010||M Ball||to building-draw ;; Use the mouse to draw buildings.
ask patch mouse-xcor mouse-ycor
[ set pcolor grey ]]
|Infectious Disease 1.0, Zombieland 1.4, Summit-Sim 3.0|
|Movement||9/15/2010||M Ball||to wall ;;turn agent away from wall
set wall-turn-check random 10
if wall-turn-check >= 6
if wall-turn-check <= 5
|Zombieland 1.4, Infectious Disease 1.0, Summit-Sim 3.0|
|Reporting||8/23/2010||M Ball||to update-globals ;; Calculate agent states each tick.
set percent-unhappy (count turtles with [not happy?]) / (count turtles) * 100
set unhappy-rich (count rich with [not happy?]) / (count rich) * 100
When complete and deployed to the CHG website, the database will be available to Group research members for use in developing agent-based models.
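To illustrate what a searchable record format for such a library might look like, here is a minimal sketch in Python. It is not the CHG database design. The field names `category`, `added`, `author`, and `code` are assumptions inferred from the example rows, and only "Models Used In" appears as a field name in the table. The sample records reuse the first line of each module shown in the table.

```python
from dataclasses import dataclass, field


@dataclass
class Snippet:
    category: str    # assumed field name, e.g. "Movement"
    added: str       # assumed field name; date string as shown in the table
    author: str      # assumed field name
    code: str        # module source (first line only in this sketch)
    models_used_in: list = field(default_factory=list)


def search(library, text="", category=None, model=None):
    """Return snippets matching every filter that was supplied."""
    text = text.lower()
    results = []
    for snippet in library:
        if category is not None and snippet.category.lower() != category.lower():
            continue
        if model is not None and model not in snippet.models_used_in:
            continue
        if text and text not in snippet.code.lower():
            continue
        results.append(snippet)
    return results


LIBRARY = [
    Snippet("Environment", "10/10/2010", "M Ball",
            "to building-draw ;; Use the mouse to draw buildings.",
            ["Infectious Disease 1.0", "Zombieland 1.4", "Summit-Sim 3.0"]),
    Snippet("Movement", "9/15/2010", "M Ball",
            "to wall ;;turn agent away from wall",
            ["Zombieland 1.4", "Infectious Disease 1.0", "Summit-Sim 3.0"]),
    # The table excerpt lists no models for this entry, so the list is left empty.
    Snippet("Reporting", "8/23/2010", "M Ball",
            "to update-globals ;; Calculate agent states each tick."),
]

for match in search(LIBRARY, model="Zombieland 1.4"):
    print(match.category, "-", match.code)
```

A flat list with a linear scan is enough for a small personal library. A larger shared collection would more likely sit behind a real database query.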
It is clear that clean coding, documentation and the use of a snippet library, while simple concepts on their own, can be used together to save large amounts of development time in the course of a project. Over the course of developing models for the Group, I have found that the use of these methods has reduced my model development time significantly. The ability to re-use previously developed code in new models is the true power of modularization. And as the Snippet Library continues to be refined and grow in depth, this level of productivity can only increase further.
The Snippet Library, Home Page. (n.d.). The Snippet Library, Your one stop shop for programming. Retrieved December 9, 2010, from http://www.snippetlibrary.com/
Ball, M. (n.d.). Infectious Disease Model. The Tipping Point. Retrieved December 9, 2010, from http://www.personal.kent.edu/~mdball/Infectious_Disease_Model.htm
Ball, M. (n.d.). Summit-Sim/NetLogo. The Tipping Point. Retrieved December 9, 2010, from http://www.personal.kent.edu/~mdball/pareto_schelling_mobility.htm
Ball, M. (n.d.). Zombies in NetLogo. The Tipping Point. Retrieved December 9, 2010, from http://www.personal.kent.edu/~mdball/zombies1_4.htm
Castellani, B., Hafferty, F., & Ball, M. (2010). E-Social Science from a Systems Perspective: Applying the SACS Toolkit. Journal of Sociocybernetics, 7(2), 89-106. Retrieved December 5, 2010, from http://www.unizar.es/sociocybernetics/Journal/JoS7-2-2009.pdf
Complexity in Health Group Mission Statement. (n.d.). Complexity in Health Group at the Robert S. Morrison Health Sciences Building, Kent State University. Retrieved December 7, 2010, from http://cch.ashtabula.kent.edu/
da Cruz, F. (n.d.). Herman Hollerith Tabulating Machine. Columbia University in the City of New York. Retrieved November 23, 2010, from http://www.columbia.edu/acis/history/hollerith.html
O’Reilly Media – Technology Books, Tech Conferences, IT Courses, News. (n.d.). O’Reilly Media – Technology Books, Tech Conferences, IT Courses, News. Retrieved December 3, 2010, from http://oreilly.com
Ravilious, K. (n.d.). Ancient Greek Computer’s Inner Workings Deciphered. Daily Nature and Science News and Headlines | National Geographic News. Retrieved November 23, 2010, from http://news.nationalgeographic.com/news/2006/11/061129-ancient-greece.html
Rojas, R. (n.d.). Konrad Zuse’s Plankalkuel and its Compiler. Zuse-Institut Berlin: Home. Retrieved December 3, 2010, from http://www.zib.de/zuse/Inhalt/Programme/Plankalkuel/Plankalkuel-Report/Plankalkuel-Report.htm
Wilensky, U. (n.d.). NetLogo Home Page. The Center for Connected Learning and Computer-Based Modeling. Retrieved December 8, 2010, from http://ccl.northwestern.edu/netlogo/