Algorithms in the Information Age

By Richard Harth

The computer is the workhorse of the Information Age, executing tasks of astonishing variety and difficulty. But without cleverly designed commands, even supercomputers balk at the problems set before them—a challenge of growing complexity, ironically, due in part to the increasingly powerful technology that drives these machines. Enter a new suite of intelligent algorithms to guide the computers of tomorrow to their targets.

Algorithms in the Information Age

In a lively series of ads for the popular search engine Ask.com, “lame algorithms” are dismissed as the very essence of uncool (never mind a longer path from points A to B). The savvy marketing team behind the campaign hopes to infuse a term once consigned to the unfashionable world of geek speak with newfound hipness. The time may be right. As it happens, all sequential processes, from sorting laundry to finding a route up Mt. Everest, are actually algorithms in disguise. Algorithmic recipes lead migrating herds to their watering holes and direct the formation of snowflakes and hurricanes.

It is in the realm of computer science, however, where the algorithm reigns supreme. All machine programming is based on these lists of instructions, which begin at an initial input state and proceed to some point of termination, a path that revved-up technology is making more challenging to navigate.

For researchers in the IIT Department of Computer Science (CS), smarter algorithms are a fundamental tool of the trade, designed, analyzed, and applied to diverse specialties, including information retrieval, wireless networking, market forecasting, computer simulation, and cryptography.

Beginnings

Algorithms in the Information Age

The CS department came into being by fits and starts. IIT Professor Peter Lykos, recruited to the IIT Department of Chemistry in 1955, had a keen interest in early computing, honing his skills on Argonne National Laboratory’s IBM 650 computer. In 1959, Lykos introduced his chemistry majors to computing, instructing students to program a least-squares calculation of vapor pressure in a language known as Octal. Still, a fully fledged department devoted to computer science would have to wait.

In the 1960s, Charles Bauer, then assistant principal of Lane Technical High School, began teaching programming courses to Chicago students, using IIT’s facilities. The enthusiasm for computer science was infectious, with area teachers eventually enrolling in Bauer’s courses as well.

In 1971, with interest in computers mounting, the IIT CS department was founded. It quickly became the fifth largest department of the 25-department university, offering a range of CS classes, some of which persist today.

Stone Age to Digital Age

Algorithms in one form or another have been around since humanity’s earliest activities. The hunting algorithms used in modern search engines are, in some respects, remnants of the algorithmic strategies for finding game in prehistory. Computer scientists refer to “greedy algorithms,” “sorting algorithms,” “genetic algorithms,” and even “ant algorithms” (which mimic the statistical behavior of foraging insects).

The term algorithm dates to the Persian astronomer, mathematician, and geographer Muhammad ibn Musa al-Khwarizmi, whose ninth century treatise was translated into Latin three centuries later as Algoritmi de numero Indorum. Long before al-Khwarizmi, simple algorithms had found their way into an array of early computing devices, including the abacus (invented in Babylonia in the fourth century B.C.) and in the first century B.C. the Antikythera mechanism (a prediction tool for stellar and planetary motion).

By 1642, the philosopher-mathematician Blaise Pascal introduced a mechanical calculator. In the early 1820s, British mathematician Charles Babbage developed a large, steam-powered Difference Engine for calculating astronomical tables. Eventually, electricity, transistors, and microchip technology, along with a host of other innovations, revolutionized computers.

Today, these fleet-footed machines leaf through billions of Internet documents per second, model conditions of ocean and atmosphere a century into the future, and even act as digital matchmakers, drawing our soul mates from ever deepening tide pools of personal data. Algorithms form the dynamic core of these applications.

Algorithms in the Information Age
IIT computer science professors [left to right] Xiang-Yang Li, Ed Reingold, Xian-He Sun, and Sanjiv Kapoor

Building a Better Algorithm

While algorithmic maneuvers in our daily lives can seem effortless, cajoling a machine into executing similar calculations can be daunting. As IIT Professor Ed Reingold, a theoretician of algorithms, explains, “The computer sees a problem one little piece of data at a time and that complicates things enormously.”

A good algorithm has to accomplish its task quickly—it has to be efficient.

Imagine, for example, trying to find your way from New York’s Upper West Side to the Lower East Side. A few glances at a good city map would probably be enough to guide you to your destination. But the computer essentially has to examine the city’s maze of intersections one at a time, forgetting about those it has looked at already. As Reingold explains, “If you boil the problem down to its essentials, you have an algorithm that manipulates a graph—an idealized structure that has points and connections between the points.” Today, sophisticated graph algorithms are at the heart of programs like Mapquest and Google Maps.

IIT Professor Sanjiv Kapoor explains that algorithms need to do more than just provide accurate solutions. Simply put, a good algorithm has to accomplish its task quickly—it has to be efficient. This is the point where the art of algorithm design emerges.

Certain tasks like sorting playing cards by value and suit or arranging a group of seashells by color and shape can be handled rapidly by a variety of intelligent sorting algorithms. Adding more elements to be sifted and sorted increases the time required for the algorithm to finish the job, but the time increases almost linearly (actually, n log n). In mathematical jargon, such problems are solved in polynomial time.

Other problems are far thornier, requiring exponential time to solve. That can cost dearly, because increasing the elements to be examined can quickly swamp computing resources. Such challenges are known to computer scientists as NP-hard problems.

A classic case of NP-hard is the “traveling salesman problem,” or TSP. Here, a salesman is sent to visit a number of different cities on his rounds, returning to his starting place after stopping once at each destination. He wants to do this at the lowest cost (i.e., in the most efficient manner). The problem is easy to pose but much tougher to solve than one might think.

As the number of cities to be visited by the salesman increases, the time required for a simple algorithm to crack the problem goes up exponentially. Thus, an algorithm that can deliver a solution for a 10-city trip in less than a second requires about 20,000 years of high-speed computing time when the problem expands to 20 cities!

NP-hard problems are often subdued using approximation algorithms—those that give a good, but not necessarily optimum, solution. The traveling salesman problem is critical for computer scientists because it recurs in endless variations and disguises, often requiring shrewd algorithmic solutions.

Algorithms in the Information Age

Kapoor studies one such NP-hard problem: finding the equilibrium point in stock and commodity markets. As he explains, “The question was raised by the French economist Leon Walras in 1874. How do these markets actually get to an equilibrium point? Or do they ever reach this point?” Today, Kapoor and others are pursuing the question through methods including algorithmic game theory, a field at the intersection of economics and traditional computer science. Kapoor and his colleagues believe they have found a type of algorithm, known as an auction process, that can find the equilibrium price for a given set of traded commodities, given certain preconditions. This process occurs naturally in the real world. “We showed that this mechanism can be made to work in polynomial time,” Kapoor notes.

Networking

Evolving algorithms also play a key role in network transactions, which today make up more than half of all Internet traffic, as Associate Professor Xiang-Yang Li points out. Li is the director of IIT’s Wireless Networking Lab and applies his algorithmic design expertise to the unique challenges associated with network environments.

File-sharing programs such as Napster are among many so-called peer-to-peer (P2P) programs in which distributed users in the network act as both servers and receivers of information. One problem with these architectures is that individual users tend to act as selfish agents, maximizing their own benefits while in the process hindering the overall performance of the network. In his efforts to optimize the cost-benefit ratios intrinsic to network environments containing selfish users, Li also makes use of algorithmic game theory, among other strategies.

Such approaches to problem solving find application well beyond the field of computer science, occasionally entering the geopolitical sphere. President John F. Kennedy’s advisors, for example, deliberated over numerous algorithmic scenarios during the Cuban missile crisis, a classic case Li teaches to his students.

Algorithms in the Information Age
Ophir Frieder

Li and departmental colleagues Ophir Frieder, Nazli Goharian, David Grossman, Peng-Jun Wan, and Wai Gen Yee are members of IIT’s Information Retrieval Laboratory, established to attack the problem of accessing useful data from the gargantuan and ever-growing sea of digitized information, both online and off.

Frieder, director of the lab, insists that our search and retrieval headaches are far from over (notwithstanding the spectacular success of Google and its carefully guarded page-ranking algorithm). “People think that ‘search’ has been solved,” Frieder observes, taking issue with this common misconception. In addition to the needle-in-a-haystack quandary of finding relevant documents via standard search engines, there are all those other documents out there. Unlike the computer-readable pages Google hunts through, these “real-world” documents, as they are known, form a hodgepodge of scribbled notes, drawings, documents with watermarks or annotations, texts on corrupted paper, and so forth.

One of many projects Frieder and his colleagues in information retrieval have worked on is the Complex Document Information Processing prototype that opened the world of real-life documents to powerful search capabilities. As a test bed for this software, the group used the Legacy Tobacco Documents Library, a storehouse of 42 million document images (roughly 1.5 terabytes) relating to the famous United States tobacco litigation. Many of the documents include handwriting, tables, signatures, and graphics along with standard text. The National Institute of Standards and Technology is already using some of the outcomes of this research in its efforts.

Other research in progress through the Information Retrieval Lab includes search technology that integrates multiple data types via IIT’s patented Intranet Mediator, search technology for Arabic language materials, detection of computer misuse, sophisticated refinements in data mining, and P2P advances.

Into the Spotlight

Algorithms are presently undergoing a sea change, largely related to two technological trends.

The first has to do with Moore’s Law, an observation of computer hardware development over time. Though the pace of advance has been remarkable—with processing speed doubling roughly every 18 months—this windfall cannot continue indefinitely. Physical limitations will curb the design of faster microchips. Once this wall is reached, only better algorithms can improve performance, at least in traditional computing.

The second trend—known as parallel processing—arose in part to circumvent these physical limitations. With parallel computing, many processors simultaneously chew away at a problem, rather than a single CPU working through its paces sequentially. The power of this technique has been enormous. Of course, there is a hitch: parallel computing also requires sophisticated new algorithms.

“It’s impossible to know today what will bear fruit in a decade or five decades.” —Ed Reingold

Parallel computing is a specialty of IIT Professor Xian-He Sun, who works on the formidable challenges of designing parallel processing algorithms. He points out that parallel algorithms are becoming a necessity, not only in the rarified world of supercomputing but also for personal computing as well. New PCs with multiple CPUs built into so-called multi-core chips have already begun to shrink parallel computing down to size.

Among Sun’s early breakthroughs were algorithms designed for tridiagonal systems. These algorithms have been included in IBM’s Parallel Engineering and Scientific Software Library and adopted as a community standard.

More recently, Sun has attacked the problem of maintaining communications among multiple users operating on mobile devices such as PALM or other PDAs. When participants interact in real time—for example, in an online game—the communications to and from a user can be lost if the member moves out of the range of the particular wireless service station.

The situation becomes much more complicated when multiple users are moving dynamically. Sun’s new protocol—an algorithm for coordination—maintains communication states under process migration, so that interactions among multiple users can operate seamlessly. He and his student received a national award for this contribution.

Forward March

Reingold is adamant that theoretical research is crucial to the future of computer science. He points to earlier discoveries in pure mathematics that have since moved to the technological forefront. “It’s impossible to know today what will bear fruit in a decade or five decades,” he stresses. “That’s one reason bluesky research can turn out to be very important. Cryptography is a terrific example of that.”

Composed of mathematics, not silicon, algorithms are endlessly flexible structures. In addition to their pride of place in computer science, algorithmic processes are embedded in nature, choreographing the movements of flocks of birds in flight and shoals of fishes under the sea. They are without limit, and will continue to shape the world.


Notable Dates: Computer Science at IIT

1959 Peter Lykos introduces IIT chemistry majors to computing. It is the first known case in the United States of computer programming appearing as a requirement for a major course, according to the Charles Babbage Institute.

1964 Charles Bauer, assistant principal at Lane Technical High School, begins teaching Saturday morning programming courses for area high school students.

1967 Lykos establishes an M.S. program in computer science at IIT, from which there are graduates who pre-date the founding of a formal IIT Department of Computer Science (CS).

1971 The IIT CS department is founded, offering 35 undergraduate and graduate courses, making it the fifth-largest department in a 25 department university and the first CS department in the Chicago area. Robert Tobey is hired from Argonne National Laboratory as the first departmental chair.

1976 Carma McClure is awarded the first CS doctorate at IIT. McClure has written a number of books concerning the representation of algorithms using diagrams and words to more clearly specify programs and improve usability and maintenance.

1985 Bauer retires from the Chicago public school system and becomes an associate professor of computer science at IIT. He remains at IIT as the longest-serving CS faculty member.

Today With more than 600 students, the CS department is the largest unit within the IIT College of Science and Letters and one of the top 10 producers of computer science master’s degrees in the United States. Graduates of the department have attained positions at Amazon, Google, Microsoft, Motorola, Wolfram Research, and other leading companies.