Monday, February 28, 2005

 

Let’s dance

The talking time is over. Yes, the Chance Discovery Consortium researchers are back in a white and cold Urbana-Champaign. After a 20-hour flight from Japan, they were in the lab at 9 a.m. sharp. It has been a long day; I do not know how they deal with their jet lag. We have been discussing the details of the methodology we are going to use for the experiment. The photo below shows (left to right) Yuichi Washida, Hiroshi Tamura, and Yukio Ohsawa taking a little break during the discussion of a hot topic: crafting and presenting emergent scenarios based on KeyGraphs of a given communication. Starting tomorrow afternoon, the four-day experiment marathon gets under way---reaching its climax on Wednesday afternoon---and we are running out of time to get everything ready.





Luckily, as I already mentioned in my previous post, none of this would be happening without the unconditional help and support of Chen-Ju Chao, Abhimanyu Gupta, Mohit Jolly, and Davina Lim. This DISCUS group, besides their hard work, has been a great example of innovation and creativity, in the best DISCUS flavor.

The milestone of the day: Chen-Ju, below, just ten minutes away from nailing down the participation of all 50 students who will be part of the marathon. All in less than three days :D




 

Potpourri of GA applications

Here, in no particular order (and with no consistent selection criteria), are a number of sites featuring GA applications in chemistry, e-commerce, economics, the solution of ordinary differential equations, and water system design. Michigan State University has a nice listing of companies doing genetic algorithms and evolutionary computation here. The forum here claims to have the "Index of Most Important Applications of the Genetic Algorithms."

Sunday, February 27, 2005

 

About writing assumptions in papers

While reviewing GECCO papers, I was reminded of the different styles of writing assumptions. Some researchers state their assumptions implicitly, others explicitly. However, I have found that some people misunderstand the meaning of those assumptions. For example, one of the assumptions of Dirk Thierens' convergence-time model (1994) is an infinite population size, and I have heard people say the result is not useful because an infinite population is unrealistic in real problems. However, by the central limit theorem, the model is already a good approximation even for populations that are not very large. Of course, there exist many papers on finite-population-size corrections, and they have different strengths and weaknesses (usually more accurate predictions but less insight; see Dave's post here). But my point is this: the results in a paper may still be very helpful for your applications even if not every assumption is fulfilled, and I like it when people state their assumptions clearly, because I feel that they really know what they are talking about.
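To make the point concrete, here is a toy check of the infinite-population prediction. This is only a sketch under stated assumptions: a Thierens-style convergence-time estimate for OneMax, t ≈ (π/2)√ℓ/I, compared against a modest finite-population run (binary tournament selection with intensity I ≈ 0.5642, uniform crossover, no mutation). The function names and parameter choices here are mine, not from the original papers.

```python
import math
import random

I_TOURNAMENT = 0.5642  # selection intensity of binary tournament selection


def predicted_convergence_time(length, p0=0.5):
    """Infinite-population estimate of OneMax convergence time (generations)."""
    return (math.pi / 2 - math.asin(2 * p0 - 1)) * math.sqrt(length) / I_TOURNAMENT


def onemax_ga(length=64, pop_size=400, max_gens=80, seed=1):
    """Binary tournament + uniform crossover, no mutation, on OneMax.

    Returns the first generation at which the population's average
    proportion of correct bits reaches 0.99.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for gen in range(1, max_gens + 1):
        fits = [sum(ind) for ind in pop]

        def pick():  # binary tournament: better of two random individuals
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[i] if fits[i] >= fits[j] else pop[j]

        pop = [[a if rng.random() < 0.5 else b  # uniform crossover
                for a, b in zip(pick(), pick())]
               for _ in range(pop_size)]
        if sum(map(sum, pop)) / (pop_size * length) >= 0.99:
            return gen
    return max_gens
```

For ℓ = 64 the model predicts roughly 22 generations, and a population of a few hundred lands in the same ballpark, which is exactly the central-limit point above: "infinite population" is a good approximation well before the population is anywhere near infinite.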

 

Catching a Portuguese digital fish

Caught the blog pescadordigital during a www.technorati.com fishing expedition. For those who don't speak Portuguese (like me), use Google here to translate the page. The blog refers to another blog called webservcga that appears to be dedicated to reporting a school project at the University of Algarve (great windsurfing, I'm told) involving a distributed compact genetic algorithm (CGA) in Java. All this surely traces back to IlliGAL Blogging blogger Fernando Lobo, although the Portuguese digital fisherman does not identify him/herself explicitly.

 

GeneticGraph for aesthetic layout

Blog vas animatum has a link to GeneticGraph, a system that hybridizes a spring algorithm with a genetic algorithm to do aesthetic graph layout.
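I have not seen GeneticGraph's code, so the following is only a guess at the general idea: evolve node coordinates with a GA whose fitness is a spring-style aesthetic energy (uniform edge lengths plus node repulsion). All names and constants here are illustrative, not taken from GeneticGraph.

```python
import math
import random

def spring_energy(pos, edges, rest=1.0):
    """Aesthetic fitness: springs on edges plus mild all-pairs repulsion."""
    e = 0.0
    n = len(pos)
    for u, v in edges:
        d = math.dist(pos[u], pos[v])
        e += (d - rest) ** 2          # edge springs prefer uniform length
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(pos[i], pos[j]) + 1e-9
            e += 0.2 / d              # repulsion spreads nodes apart
    return e

def ga_layout(n_nodes, edges, pop_size=40, gens=200, seed=0):
    """Evolve 2D node positions that minimize the spring energy."""
    rng = random.Random(seed)
    def rand_layout():
        return [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(n_nodes)]
    pop = [rand_layout() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda p: spring_energy(p, edges))
        survivors = pop[: pop_size // 2]          # elitist truncation selection
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            child = [a[i] if rng.random() < 0.5 else b[i]  # uniform crossover
                     for i in range(n_nodes)]
            k = rng.randrange(n_nodes)            # small Gaussian mutation
            x, y = child[k]
            child[k] = (x + rng.gauss(0, 0.1), y + rng.gauss(0, 0.1))
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda p: spring_energy(p, edges))
```

On a 4-cycle, this settles into a roughly square layout with near-unit edge lengths, which is the aesthetic the spring term encodes.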

 

CiteULike

Machine Learning, Etc. has a useful post on CiteULike, a social citation system that looks pretty interesting.

 

More GA thesis blogging

Eagle-eyed blogger Yaroslav Bulatov of Machine Learning, Etc. left a comment to my post on thesis blogging about Trash Nexus, another genetic algorithm thesis blogger.

Saturday, February 26, 2005

 

Clustering for good

The DISCUS experiment continues at a firm pace. We have collected enough media survey questionnaires. After examining them, Yuichi Washida has clustered the participants according to his research findings. In his own words, on diffusion patterns of knowledge:
“Our arguments consist of two hypotheses: One is the difference of function by gender as a medium of knowledge diffusion in Japan's actual product market, while the other is the difference by generations. We found that certain generation and gender groups play important roles at the early stage of knowledge diffusion regarding new types of products with new technologies, which enable manufacturers to get effective feedback in creating and improving their products. We argue that the social network of this rapid and divergent feedback from the consumer side may explain why Japanese electronic manufacturers can realize high-quality products in short time periods.”
Questionnaires were grouped into four different clusters. The experiments we will be conducting next week require enough representatives of each of those clusters. Scheduling people in groups, and making sure that they will show up, is a time-consuming job---not to mention a tough scheduling problem.



The job has another added element of difficulty: the unbalanced distribution of participants among the clusters. The board above shows the imbalance we face in two of the four clusters. Thus, for the success of the whole experiment, Chen-Ju Chao and Davina Lim---you can see them below---have been working hard to make sure there are enough participants available from each cluster.



 

Panda uses "genetic" heuristic engine

Panda Software, a maker of computer security software, now includes a genetic heuristic engine to detect previously unknown malware without updates or signature files:

Panda has extended its TruPrevent(TM) Technologies and announces the availability of its most innovative heuristic technology against unknown threats: Genetic Heuristic Engine. This new technology integrates correlation of genetic digital signatures and deep code inspection in a single algorithm, patent pending, which scans the code and DNA traces typical of malware.

During the testing period, this technology has demonstrated remarkable effectiveness, detecting hundreds of new unknown malicious programs (including spyware) that have emerged over the last few months, without needing updates or signatures. By working alongside the rest of the TruPrevent(TM) Technologies, its detection capacity is maximized while false positives are minimized and with a minimum impact on system performance.
The company website is sufficiently obscure as to prevent this reader from knowing whether the term "genetic" is used in the sense of a genetic algorithm or in the sense of a genetic metaphor. Do any IlliGAL Blogging readers know more?

 

GAs, swimming & data mining

Swimming Science claims that using a simple GA to select features for mining data about swimmers helped improve classification accuracy by as much as 25.37%. Read the abstract of the study here.
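The study's exact setup isn't reproduced here, but GA feature selection of this wrapper kind is easy to sketch. The following toy version is entirely my own construction (synthetic data, a nearest-centroid classifier, and hypothetical parameter choices throughout): a GA evolves a bit mask over features, and fitness is classification accuracy minus a small penalty for large subsets.

```python
import random

def make_data(n=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic binary-classification data: 3 informative + 7 noise features."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feats = [rng.gauss(label * 1.5, 1.0) for _ in range(n_informative)]
        feats += [rng.gauss(0, 1.0) for _ in range(n_noise)]
        data.append((feats, label))
    return data

def accuracy(mask, train, test):
    """Nearest-centroid classifier restricted to the selected features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    cent = {}
    for lbl in (0, 1):
        rows = [f for f, l in train if l == lbl]
        cent[lbl] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    hits = 0
    for f, l in test:
        d = {lbl: sum((f[i] - c) ** 2 for i, c in zip(idx, cent[lbl]))
             for lbl in (0, 1)}
        hits += (min(d, key=d.get) == l)
    return hits / len(test)

def ga_select(data, n_feats=10, pop_size=30, gens=40, seed=1):
    """Evolve a feature mask; for brevity one split scores fitness directly."""
    rng = random.Random(seed)
    train, test = data[: len(data) // 2], data[len(data) // 2:]
    fit = lambda m: accuracy(m, train, test) - 0.01 * sum(m)  # prefer small subsets
    pop = [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fit, reverse=True)
        elite = pop[: pop_size // 2]
        pop = elite + [
            [rng.choice(pair) if rng.random() > 0.05 else rng.randint(0, 1)
             for pair in zip(*rng.sample(elite, 2))]   # uniform crossover + mutation
            for _ in range(pop_size - len(elite))
        ]
    return max(pop, key=fit)
```

A real study would cross-validate the fitness rather than score on a held-out set reused every generation, but the wrapper structure is the same.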

 

Genetic Algorithm blog blogs GAs

Just found a blog named Genetic Algorithm on www.technorati.com. I missed it previously because I had been searching on "genetic algorithms" (plural). Oops. The owner is apparently working on a thesis devoted to GAs and the traveling salesman problem. Lots of water over that dam. I hope he/she considers using competent crossover/mutation per The Design of Innovation.
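For readers following along at home, the workhorse recombination operator in most GA-for-TSP theses is some variant of order crossover (OX), which preserves relative city order. Here is a minimal sketch of one common variant (this is the classical operator, not the competent-GA machinery of The Design of Innovation):

```python
import random

def order_crossover(p1, p2, rng):
    """Order crossover (OX), common variant: copy a random slice from p1,
    then fill the remaining positions with p2's cities in p2's order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j] = p1[i:j]                      # inherited slice from parent 1
    used = set(p1[i:j])
    fill = iter(city for city in p2 if city not in used)
    for pos in list(range(j, n)) + list(range(i)):
        child[pos] = next(fill)               # remaining cities in p2's order
    return child
```

Whatever the operator, the child is always a valid tour: every city appears exactly once.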

 

John Searle

A previous post remarked favorably on some of John Searle's work (let's sidestep the Chinese room for now). Here's a picture:



The picture above is one I've seen on book jackets before (linked from www.kurzweilai.net).

 

Making peace with postmodern thought

Julian Garcia links to the Postmodernism Generator, a site that uses AI techniques to generate sensibly insensible postmodernist essays. Scientists have found this kind of thing enormously amusing since Alan Sokal's famous hoax, but I wonder if the joke isn't on us. The joke, apparently, is that postmodern argumentation has an involved set of semantic conventions and intricate theoretical constructs that look strange to the uninitiated, and the Sokal hoax showed that even the initiated have difficulty identifying the real thing.

But the essence of postmodernism (if such a thing can be said to exist) isn't really all that weird. It seems to me that the core of postmodernism is the acceptance of the principle that key facts of life are socially determined (little things like language, money, political and social institutions, that sort of thing) and our agreement (or disagreement) about their constitution is an integral part of their reality.

Of course, if this sort of thing is taken too far, it is a slippery and unacceptable slope toward solipsism, and from the point of view of an engineer or scientist, physics seems pretty darn real irrespective of the observer (Thomas Kuhn and his paradigm shifts notwithstanding). So what's a nice postmodern engineer to do in making peace with the postmodern world?

It seems to me that one sensible thing to do is to go read John Searle's account of all this in The Construction of Social Reality. Searle's brilliant argument preserves physics (brute facts) and delineates them from social facts in a rigorous manner. As with many philosophers, Searle's argumentation is not for the faint of heart, and it is not a light book to be read at the beach. Nonetheless, it seems like just the right antidote for those who might be tempted to take the postmodernism-as-joke thesis a bit too far.

 

Kevin Kelly lists advances in scientific method

Kevin Kelly, co-founder of Wired and futurist, has an interesting list of relatively recent innovations in the scientific method. To compile his list, he polled an eclectic group of scientific movers and shakers, and the result is posted here. The list includes "pattern mining," but it does not include automated invention via genetic algorithms or genetic programming. Pardon me for asking, but isn't the effective automation of the innovative/creative part of the scientific method kind of a big thing?

 

PhD scholarship for Indian nationals at Politecnico di Milano

This is from Pier-Luca Lanzi:
In July, the Politecnico di Milano will offer two PhD studentships in Computer Engineering for Indian candidates. The scholarship includes a stipend of around 850 euros and accommodation in student dorms.
Those interested should contact Pier-Luca for additional information about the application procedure.

 

Genetic programming and population sizing

The development of population-sizing models for genetic algorithms (Goldberg, Deb, & Clark, 1992; Harik, Cantu-Paz, Goldberg, & Miller, 1999) was important for a better understanding of genetic algorithms. Last year, Kumara Sastry, Una-May O'Reilly, and David E. Goldberg published a paper on population sizing for genetic programming based upon decision making. I think this is one of the most important theoretical results in genetic programming, and everyone seriously interested in understanding genetic programming should have a look at the paper. To let the authors speak for themselves, here is part of the abstract:

This paper derives a population sizing relationship for genetic programming (GP). Following the population-sizing derivation for genetic algorithms in Goldberg, Deb, and Clark (1992), it considers building block decision making as a key facet. The analysis yields a GP-unique relationship because it has to account for bloat and for the fact that GP solutions often use subsolutions multiple times. The population-sizing relationship depends upon tree size, solution complexity, problem difficulty and building block expression probability.
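For reference, the GA population-sizing relationship that the GP paper builds on can be written down in a few lines. The following is the gambler's-ruin model of Harik et al. (1999); the function name and the example numbers are mine:

```python
import math

def gambler_ruin_pop_size(k, m, d, sigma_bb, alpha=0.01):
    """Gambler's-ruin population-sizing estimate (Harik et al., 1999).

    k        -- building-block size (order)
    m        -- number of building blocks; m - 1 compete for decision making
    d        -- signal: fitness difference between best and second-best BB
    sigma_bb -- building-block fitness noise (standard deviation)
    alpha    -- acceptable failure probability per building block
    """
    return (-(2 ** (k - 1)) * math.log(alpha)
            * sigma_bb * math.sqrt(math.pi * (m - 1)) / d)
```

For a 100-bit OneMax (k = 1, d = 1, sigma_bb = 0.5, alpha = 0.01) this gives a population of roughly 41. The GP relationship in the Sastry, O'Reilly, and Goldberg paper adds terms for tree size, bloat, and building-block expression probability on top of this kind of decision-making core.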

 

Dad, is that a good thing?

I was doing some Google diving for old blog posts on genetic algorithms and came across an oldie but a goodie that brought back a funny memory. In 1997, I was interviewed by Jessie Scanlon, then of Wired Magazine, for an article called Mother Nature on a Motherboard. The article was published in the February 1997 issue on the Geek Page. I was pretty proud of myself, so I got a copy and brought it home. I was showing the article to my older boy Max, who had just turned 8, and said, "Max, here's an article about your dad on the Geek Page of Wired Magazine. How 'bout that?" He paused for a minute, his face got serious, and he replied, "Dad, are you sure the Geek Page is a good thing?"

Friday, February 25, 2005

 

Genetic and evolutionary computation bibliographies

The following is a post from GP List:
The Artificial Evolution Conference has been recently added to the collection of EC bibliographies. Additionally, the Artificial Evolution Bibliography can be searched via the search interface.
A searchable list of computer-science related bibliographies can be found here, and EC related bibliographies in BibTeX format can be found here. Citeseer, DBLP, and arXiv.org are other good online resources.

 

Heard on the GA street

Future feeder has a post on genetic programming and human-competitiveness. Team popcorn appears to be doing a multiple-car race simulation using genetic algorithms here. Kerneltrap discusses the release of a new version of the genetic library patches for tuning the Linux kernel. ForwardMarkets has a post on using agents and genetic algorithms for calculations of strategic bidding in business. Maverick wonders why we can't create a real artificial intelligence given NNs, GAs, AI, ML, and all the rest. Finally, I never would have guessed how frequently people spell the word illegal as illigal on their blogs.

Thursday, February 24, 2005

 

Business blogging on the rise

In a previous post, I commented on the dearth and possible rise of academic blogging. In a related vein, Franz Dill has been blogging about the rise of business blogging, both in- and outside the walls of corporations. His latest post is over at IFTF's Future Now. A particularly good catch was his link to Business Blog Consulting. Franz is also blogger-in-chief over at The Eponymous Pickle, where today he has a short post on agent-based modeling.

 

Rubik's cube via EC

Virialexpansion wonders

I could probably use EC to attempt to solve the optimization version of that problem. There are still some more definitions that need to be made; if they can be quantified, then it could be attempted. I wonder if EC could find the theoretical minimum of T, not only the cardinality, but the actual moves? If it could, how would you prove it, though? The problem with that is that there are 43 quintillion different configurations of the cube... Ouch.

I saw an EC approach to Rubik's cube back in 1997 when I visited Ingo Rechenberg's Bionik und Evolutionstechnik laboratory at the Technische Universitaet Berlin as part of my first sabbatical. The link here has an evolution strategy approach to solving the cube and 9 other ES demos dating back to PPSN 1994.
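For the curious, Rechenberg's classic algorithm is easy to sketch: a (1+1)-ES with the 1/5 success rule, shown here minimizing a toy sphere function rather than the cube (encoding Rubik's cube for an ES is a much harder design problem, as the post above suggests). Constants and names are my own choices.

```python
import random

def one_plus_one_es(f, x0, sigma=1.0, iters=2000, seed=0):
    """(1+1)-ES with Rechenberg's 1/5 success rule (minimization)."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    successes = 0
    for t in range(1, iters + 1):
        y = [xi + rng.gauss(0, sigma) for xi in x]   # Gaussian mutation
        fy = f(y)
        if fy < fx:                                  # keep the better of parent/child
            x, fx = y, fy
            successes += 1
        if t % 20 == 0:                              # adapt step size every 20 trials
            rate = successes / 20
            sigma *= 1.22 if rate > 0.2 else 0.82    # 1/5 success rule
            successes = 0
    return x, fx

sphere = lambda v: sum(c * c for c in v)
```

On a 5-dimensional sphere starting from (3, ..., 3), a couple of thousand iterations drive the objective very close to zero; the step-size adaptation is what keeps progress steady as the optimum nears.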




The image above is of Ingo Rechenberg linked from his web page.

Wednesday, February 23, 2005

 

Ready, steady,…

The big day for DISCUS is coming. We have been collecting the media survey questionnaires for our colleagues in Hakuhodo and the Chance Discovery Consortium for more than a week. Now that we have the questionnaires, we will start contacting the people interested in joining the focus groups for detailed discussions about how they use the internet, email, and cell phones. From Monday to Friday, the lab will not look as empty as the picture below.



The only channel of communication among participants will be the computer-mediated one provided by DISCUS. Participants will be sitting at the lab workstations using DISCUS---yes, I will post some pictures with a lab running at full steam. No other communication among participants will be available. This way, we will be able to analyze totally self-contained communications using KeyGraph and Influence Diffusion Models.



However, this is not a one-man job. I must thank all the DISCUS people. Without them, we could not even think of running next week's experiment. They are making this experiment a reality, taking care of even the smallest logistical details.

 

Stewart Wilson: Mover and shaker of LCSs

I started a photo tour of genetic algorithmists and computational evolutionaries with my image of John Koza a few posts ago. Here's an image of the prime mover of learning classifier systems (LCSs), Stewart Wilson:


The picture I wanted to post had Stewart wearing sunglasses, and they were a nice touch. But knowing Stewart, they were for UV protection, not the pursuit of cool.

 

Lemmings fans dig GAs

A team at Nottingham has developed a genetic algorithm to play the game Lemmings. Download the paper here. Hat tip to The Ludologist.

 

GAUL package updated

Freshmeat has a report that The Genetic Algorithm Utility Library (GAUL) has been updated:
Numerous additions and improvements were made. Most notably, island-model genetic algorithms are now available as parallel versions using either MPI or pthreads. Several new demonstration programs were added to the distribution.

More information is available on the GAUL homepage here.

 

GA used in lung cancer understanding

The New England Journal of Medicine reports the use of a genetic algorithm in better understanding a form of lung cancer. In an article entitled EGFR Mutation and Resistance of Non–Small-Cell Lung Cancer to Gefitinib, the authors report using GAs fairly routinely as part of structural modeling procedures:

The crystallographic structure of the EGFR tyrosine kinase domain, solved in complex with erlotinib, was used as a model for the prediction of kinase-inhibitor binding (Protein Data Bank accession code 1M17 [PDB] ).14 The inhibitor and solvent were stripped from the model. We used the AutoDock program, version 3.0,15 to predict binding, first using a model of erlotinib, made by means of the JME molecular-editing feature of the online resource PRODRG.16 The erlotinib test yielded a model for ligand binding highly similar to that seen in the crystal structure. Using the AutoDockTools interface, we used a grid spacing of 0.375Å and 60x50x40 points centered around the catalytic cleft of the enzyme for docking and adopted the genetic algorithm with local search using default settings. Gefitinib and CL-387,785 were then docked with the use of the same protocol. To illustrate potential inhibitor clashes with the T790M mutant, we prepared figures in which threonine at position 790 (T790) is mutated to methionine. We then chose the lowest free-energy cluster that overlapped in the quinazoline moiety with the crystallographic coordinates found for erlotinib binding.

This work cites earlier work in the following reference:

Morris GM, Goodsell DS, Halliday RS, et al. Automated docking using a Lamarckian genetic algorithm and empirical binding free energy function. J Comput Chem 1998;19:1639-1662.

Additional details are available here.

 

New IlliGAL technical reports

IlliGAL is pleased to announce the publication of the following technical reports:

Lanzi, P.-L., Loiacono, D., Wilson, S. W., Goldberg, D. E. (2005). Extending XCSF Beyond Linear Approximation. IlliGAL Report No. 2005006. (Abstract) (Full paper in PS) (Full paper in PDF)

Lanzi, P.-L., Loiacono, D., Wilson, S. W., Goldberg, D. E. (2005). XCS with Computable Prediction for the Learning of Boolean Functions. IlliGAL Report No. 2005007. (Abstract) (Full paper in PS) (Full paper in PDF)

Lanzi, P.-L., Loiacono, D., Wilson, S. W., Goldberg, D. E. (2005). XCS with Computable Prediction in Multistep Environments. IlliGAL Report No. 2005008. (Abstract) (Full paper in PS) (Full paper in PDF)

Llorà, X., Sastry, K., Goldberg, D. E., Gupta, A., Lakshmi, L. (2005). Combating User Fatigue in iGAs: Partial Ordering, Support Vector Machines, and Synthetic Fitness. IlliGAL Report No. 2005009. (Abstract) (Full paper in PS) (Full paper in PDF)

Other IlliGAL technical reports and publications are available here.

 

Man in the striped shirt

Genetic programming (GP) inventor John Koza has an unexplained penchant for wearing striped shirts regardless of the occasion.



Red stripes are his clear favorite, although the picture (linked from www.genetic-programming.org) shows a dashing gray-blue stripe. Must have been a special engagement or holiday.

 

$10,000 in prizes for GEC human-competitive results

John Koza and Third Millennium Online Products, Inc. announce a competition for human-competitive results produced by genetic and evolutionary computation. The deadline for the competition is 20 June 2005, and the announcement may be seen here. According to the announcement, a result may be considered human-competitive if it satisfies one of eight criteria:
  1. The result was patented as an invention in the past, is an improvement over a patented invention, or would qualify today as a patentable new invention.
  2. The result is equal to or better than a result that was accepted as a new scientific result at the time when it was published in a peer-reviewed scientific journal.
  3. The result is equal to or better than a result that was placed into a database or archive of results maintained by an internationally recognized panel of scientific experts.
  4. The result is publishable in its own right as a new scientific result, independent of the fact that the result was mechanically created.
  5. The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions.
  6. The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered.
  7. The result solves a problem of indisputable difficulty in its field.
  8. The result holds its own or wins a regulated competition involving human contestants (in the form of either live human players or human-written computer programs).

The inaugural competition in 2004 awarded $5000 in prize money among six medalists.

In the interest of full disclosure, the writer of this post was a contest judge in 2004 and will serve in that capacity again this year.


 

RioRoboLab demos autonomous chopper

The New Mexico State University Roundup reports that members of the RioRoboLab showed a tethered helicopter during visitors day:
RioRoboLab director Ram Prasad said work on the autonomous helicopter began last semester as a Capstone Design effort under sponsorship of the U.S. Army White Sands Missile Range. A Capstone Design is an academic requirement for all graduating engineering students. Prasad said although helicopters are the most maneuverable aircraft, they are the most difficult to control when trying to provide stable flight characteristics. The objective is to implement technologies that are bio-inspired and can emulate human behavior to fly the helicopter. In response, students in the RioRoboLab have worked toward developing control systems to allow a model helicopter to attain autonomous flight.

Whether or not genetic algorithms or evolutionary computation were part of the demo is unclear from the article; however, the lab is a part of a larger Rio Grande Soft Computing Institute. The mission of the institute is "to develop and facilitate the application of innovative soft computing technologies for modeling, analysis, prototyping, manufacturing, testing and evaluation of dynamic processes and systems that have use in government and in industry."

Tuesday, February 22, 2005

 

The evolutionary music of E. R. Miranda

I was browsing the web for information on evolutionary music and came across the entries at Sony CSL Paris here and that took me to E. R. Miranda's web site at Plymouth here. Check out the cut Electroacoustic Samba II from his album Mother Tongue.

 

Flickr pool features evolutionary art

Flickr.com supports photo and image blogging, and a Flickr group devoted to Generative and Evolutionary Art has formed here. There are not many words on the site, so it's hard to determine the techniques used for each image. Nonetheless, here is an image by killbyte called Pointy Rainbow.

 

GAs mentioned in networking book

ITworld.com has an interview with Max Goff, author of Network Distributed Computing: Fitscapes and Fallacies (Prentice-Hall) in which Goff mentions the possibility of evolving networks of distributed computers using genetic algorithms and genetic programming. He coins the term fitscapes as a contraction of the more familiar "fitness landscapes" in talking about distributed networks.

Although Goff's argument is cast in evolutionary terms, his reasoning is more like that of Adam Smith in Wealth of Nations:
Goff: Well, it dawned on me that there were a lot of autonomous entities that were involved in the creation of software -- i.e., individuals and companies. But it also dawned on me that the good things that would probably happen with respect to software, if that fitscape model -- in other words, autonomous agents operating in their own best self interest -- if that sort of model were unleashed from a software perspective, that good things would happen. And part of what I discuss in the book are platforms that tend to give rise to that sort of organic behavior.

The only thing missing is the invisible hand. Although GAs, GP, and EC generally are useful technologies, their influence as a metaphor for reasoning about population-oriented systems is perhaps just as important.

Monday, February 21, 2005

 

Rube, Dave, what's the difference?

Someone searched on the term "Ruby Goldberg," but I think they meant Rube Goldberg, the cartoonist. I have a stash of Rube Goldberg postcards with reprinted cartoons I use as thank you notes, although to my knowledge there are no family ties between us. Here is Tee Up a Golf Ball. The groundhog is a nice touch.

Sunday, February 20, 2005

 

Folksonomy, John Holland, and DB design

Here is a white paper from a company called Ecosystems Design, Inc. that uses Hollandian ecosystems as a metaphor for database system design. The white paper is written by Len, who then, writing at Life Among Mammals, uses Holland's bucket brigade here for some speculation regarding folksonomy.

 

Use GAs to identify dialects?

Don Schwartz speculates that a spatial GA might be used to analyze and identify the dialect of a speaker. There has been a fair amount of work on GAs and speech recognition and some on GAs and natural language processing. How would such an application really work?

 

Great workshops at GECCO-2005

The Genetic and Evolutionary Computation Conference (GECCO-2005), to be held in Washington, DC, 25-29 June 2005 (Saturday to Wednesday), has a terrific lineup of workshops.

A complete listing is available here. Workshop attendance is included at no additional charge in GECCO registration fees. A number of IlliGAL Blogging bloggers are workshop organizers. Perhaps some of them will tell us how their workshop plans are coming along.


 

Peirce, Burks, and GAs

I just saw Rob Smith's nice post on Peirce. Given the connections Rob makes, it should come as little surprise that genetic algorithms have an interesting intellectual lineage vis-à-vis Peirce. John Holland, genetic algorithm pioneer, was the first PhD in Communication Science (later Computer and Communications Science) at Michigan. Holland's dissertation advisor was Arthur W. Burks. Burks worked on the original ENIAC



(I believe Burks is the gentleman seated in the picture, but I can't confirm this in any of the web sources I could find) and founded Michigan's Logic of Computers Group. But Art Burks is also a Peirce scholar and edited the final two volumes of Harvard's Collected Papers of C. S. Peirce (see here). Here is a listing of some of Art Burks's other books.

 

Confessions of a Teaching Company fan

This post doesn't have anything to do with genetic algorithms, but I was exercising this morning and watching Jules Schwartz of Boston University lecture on product life cycle in his course Finance and Accounting for the Non-Financial Manager, offered by the Teaching Company. I've been a fan of the Teaching Company for maybe eight years. Back then, I was a runner (jogger, really) training for marathons, and I would get Teaching Company audio tapes to keep me company during long training runs and to learn new stuff. I've stopped pounding the pavement and now exercise indoors, and I get TC videos to keep me company on my Precor elliptical trainer (sigh, if only IlliGAL Blogging readership warranted product endorsement fees). Anyway, if you've never taken one of these courses, do it. An early course in my listening was Great Minds of the Western Intellectual Tradition. It is now in its 3rd edition and is well worth the investment of time and money.

 

If you have to learn this stuff (and you do)

Just finished "Knowledge Representation: Logical, Philosophical, and Computational Foundations" by John Sowa. I know GA types aren't likely to be into logic-based representations, but this book offers a great perspective on "neat" (rather than "scruffy") AI. The perspective is drawn from the wonderful (and sadly often overlooked) American philosopher C. S. Peirce.

I think Peirce (sometimes referred to as "the American Aristotle") is definitely worth our looking into. He was saying things in the late 1800s that thinkers wouldn't revisit for 100 years. His concept of abduction as a third form of inference (in addition to the classical deduction and induction) is particularly interesting.

As Sowa describes Peircian abduction, it can consist of several iterated procedures, including:

"Reuse. Do an associative search for a predefined theory that can be reused for the current problem.
Revise. Find a theory that approximately matches the problem at hand and use belief revision techniques to tailor it for the current situation.
Combine. Search for scattered fragments of knowledge and perform repeated steps of belief revision to combine them into a complete theory."

I think this has some resonances for GAs and learning classifier systems.

Though Peirce saw abduction as a set of real inference procedures, he also noted (apropos of recent posts on random discovery):
"There is a more familiar name for it than abduction, for it is neither more nor less than guessing."

(Note that I will sometimes talk about Peirce on my blog, as will my friend Russell, who writes tired fools. But be warned: these are not strictly scientific or philosophical blogs, and some content may offend!)

Friday, February 18, 2005

 

Power-law networks and epistasis

Many natural networks, such as molecules in a cell or people in a social group, are so-called power-law networks. Power-law networks are quite stable and robust against random node failures. However, they are exposed to occasional catastrophic collapse through the loss of their most highly connected nodes. If gene regulatory networks are also power-law networks, genomes must protect their highly connected genes from loss through recombination and mutation in order to survive. How do living things acquire evolvability in such a situation?
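The failure/attack asymmetry is easy to see in a toy simulation, which is all this is: grow a preferential-attachment graph (a standard recipe for power-law degree distributions), then compare the largest surviving component after removing 5% of nodes at random versus removing the 5% highest-degree hubs. All parameters and function names are illustrative.

```python
import random
from collections import defaultdict

def preferential_attachment(n, m=2, seed=0):
    """Grow a graph where each new node links to m nodes chosen in
    proportion to current degree (yields a power-law degree distribution)."""
    rng = random.Random(seed)
    edges = {(i, j) for i in range(m + 1) for j in range(i)}  # seed clique
    repeated = [i for e in sorted(edges) for i in e]  # node i appears deg(i) times
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))          # degree-proportional pick
        for t in chosen:
            edges.add((new, t))
            repeated += [new, t]
    return n, edges

def largest_component(n, edges, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    adj = defaultdict(set)
    alive = set(range(n)) - removed
    for u, v in edges:
        if u in alive and v in alive:
            adj[u].add(v); adj[v].add(u)
    best, seen = 0, set()
    for s in alive:
        if s in seen:
            continue
        stack, comp = [s], 0
        seen.add(s)
        while stack:                                  # depth-first flood fill
            x = stack.pop(); comp += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y); stack.append(y)
        best = max(best, comp)
    return best

def attack_vs_failure(n=500, frac=0.05, seed=1):
    """Largest component after random failures vs. targeted hub removal."""
    rng = random.Random(seed)
    n, edges = preferential_attachment(n)
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1; deg[v] += 1
    k = int(frac * n)
    hubs = set(sorted(deg, key=deg.get, reverse=True)[:k])  # targeted attack
    rand = set(rng.sample(range(n), k))                     # random failure
    return largest_component(n, edges, rand), largest_component(n, edges, hubs)
```

Random failures barely dent the giant component, while removing the same number of hubs fragments it noticeably, which is the robustness/fragility trade-off the post describes.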

 

Some serious EH blogging in Portuguese

There's some serious blogging on evolvable hardware, in Portuguese, over at eyes wide open.

Thursday, February 17, 2005

 

Gizoogling GAs

According to today's Wall Street Journal, google.com is available in over 100 languages, but not hip-hop slang. That stunningly insensitive marketplace void was recently filled by gizoogle.com. Doing a search on genetic algorithms on gizoogle (here) is enlightening dawg. Peace out.

 

The rise of academic blogging

Yesterday I posted the following:


Since IlliGAL Blogging opened up shop on 24 January, I've been a little surprised that the young avant garde of the EC world hasn't been blogging its brains out, but maybe I'm missing something.
To which, Amir at thesilog commented:

During my surfing in blogosphere, I have found few really technical weblogs. This rush of academic weblogs (in AI-related fields) is rather new or at least new in the eye of me! Most previous academic weblogs are something between daily life of an graduate student or professor and their research-related news, e.g. mine weblogs is a sample of it.
Yes, this is what I've found, and I think it is rather surprising. Blogging has been around for a while. I would have thought that dozens of young faculty and hotshot grad students who already were blogging for personal reasons would have turned to academic blogging by now.

Amir continues:

And more importantly, it would be nice to discuss about "what a weblog can do for us?". Is it a place to report, or a place to disuss, or place to fast-publishing? What about its formality? Can I claim that I have this XYZ idea if I do not publish a paper in a conference or journal and write a post in my weblog instead?
Others need to answer these questions for themselves, but my lab website has always been a place to disseminate work, inform, and influence. Blogging for IlliGAL Blogging is a way to do those things in a manner that can provide more continually updated content. Also the comment facility allows us to have useful online dialogue in a straightforward manner. Previous efforts at interacting with readers were too static or too annoying.

Moreover, I like the informality of blogging, but I've never been one to be overly formal in academic discourse either. This work is fun. I'm not sure why it is necessary to dress up the joy of discovery in stilted language and thereby turn it into a form of drudgery.

I do share Amir's concern with posting new ideas prior to publication. I don't believe blogging takes the place of scholarly publication, and we are holding new results from this site until they are published. Having said this, it has been a long-standing policy at IlliGAL to publish papers as tech reports on the web immediately following paper completion. The public nature of that posting has made it difficult for others to claim our ideas as their own, and perhaps contemporaneous posting of new ideas to a blog could serve the same function. This might especially become the case if bloggers continue their habit of citing one another generously (as is the academic ideal).

If this were to take place, rapid exchange of new ideas in blogs could represent a new kind of open-source brainstorming, but I have difficulty believing this process will become the norm. It is hard to imagine a tenure and promotion committee ever looking favorably upon a series of blogposts as being sufficiently serious. On the other hand, much of what is going on right now on the web was difficult to imagine 15 years ago. What do IlliGAL Blogging readers think?

Wednesday, February 16, 2005

 

Notes 2 Self notes EC Blogs

Notes 2 Self notes lots of new EC blogs here. Has anyone got a listing of other EC-related blogs? Since IlliGAL Blogging opened up shop on 24 January, I've been a little surprised that the young avant garde of the EC world hasn't been blogging its brains out, but maybe I'm missing something.

 

Supergenes and competence

Javarunner continues to post about JGAP and evolutionary computation here, but digging below the surface turns up an interesting post on supergenes. Supergenes in JGAP aggregate variables that are interrelated, as determined either from inspection of the set of equations being optimized or from intuition. The supergene page claims the following:
The supergene conception can significantly (up to 3 times in the most boundary cases) increase the evolution speed. It can also increase the accuracy of solution for the small populations. However these effects are dependent from the chosen population size and maximal number of iterations. While in some cases the use of supergenes can be definitely sensible, we would recommend to try both supergene and non-supergene versions for your specific task.
Interestingly, IlliGAL work has demonstrated significant speedups (sometimes going from exponential scalability to subquadratic) on hard problems in a similar way, except that IlliGAL competent GAs require no prior knowledge of the problem being solved or of the variable interrelationships. This approach works because initial results are used to determine which variables are interdependent, and then "supergenes" are automatically constructed to solve the problem quickly.

Similar automatic linkage detection or supergene detection could be built into JGAP and other traditional GAs so they automatically give faster, better solutions to hard problems without as much prior problem knowledge or understanding.
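To make the supergene idea concrete, here is a toy sketch (the 12-bit genome and the fixed groupings are invented purely for illustration) of a crossover operator that treats each linkage group as an indivisible unit, so co-adapted bits are never split the way single-point or uniform crossover can split them:

```python
import random

# Three hypothetical 4-bit "supergenes" (linkage groups) in a 12-bit genome.
GROUPS = [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11)]

def supergene_crossover(p1, p2, rng):
    """Inherit each linkage group whole from one parent or the other."""
    child = p1[:]
    for group in GROUPS:
        if rng.random() < 0.5:
            for i in group:
                child[i] = p2[i]
    return child

rng = random.Random(0)
a = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
c = supergene_crossover(a, b, rng)
# every 4-bit group in the child came intact from exactly one parent
for group in GROUPS:
    vals = [c[i] for i in group]
    assert vals == [a[i] for i in group] or vals == [b[i] for i in group]
print(c)
```

The competent-GA twist described above is to learn the `GROUPS` structure from early evaluations rather than hand-coding it.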

 

Script and program resources listed

at Marcus Zillman's Script Resources blog here.

 

Wash mouth out with GA SOAP

Jim Fuller's ruminations on XML, XSLT, and Ant posts a link to a ppt presentation on how to "create a GA process that will discover the XSLT program which taken a source.xml generates a target.xml."

 

Selfish gene in a rice plant

My elder brother, who studies the genetics of rice plants, told me that selfish genes exist in rice. When certain African and Japanese rice plants, which derive from the same genome, are crossed with each other, a single gene at one locus in the African rice plant always kills the Japanese rice plant, while the gene at the same locus in the Japanese rice plant does not kill the African rice plant at all. The gene in the African rice plant looks quite selfish. His current research topic is to reveal how such selfish genes appear, though it is said that they arise by mutation after geographical isolation (niching) occurs. I wonder whether it is possible to utilize such mechanisms of species specialization in the context of genetic and evolutionary computation, and whether similar mechanisms or phenomena can be seen in human organizations.

 

Holland festschrift volume published

I received my shiny new copy of the edited volume Perspectives on Adaptation in Natural and Artificial Systems from Oxford University Press today. Some of the contributors include Kenneth Arrow, Brian Arthur, Bob Axelrod, Lashon Booker, Michael Cohen, Stephanie Forrest, Doug Hofstadter, John Koza, Melanie Mitchell, Rick Riolo, and Oliver Selfridge, to name a few. The volume follows up on the 1999 festschrift in Ann Arbor honoring John H. Holland's many contributions to science and technology. My contribution is entitled "John H. Holland, Facetwise Models, and Economy of Thought." Only $54.50 in hardcover. Operators are standing by to take your call (actually, servers are standing by to take your computerized order).

Tuesday, February 15, 2005

 

PostgreSQL uses GAs

I've been teaching an introductory database systems course for some years, and in class we've been using PostgreSQL, an open-source relational database management system. PostgreSQL is a nice system and has its roots at UC Berkeley. I found out that it uses GAs in its SQL query optimizer.

 

Why blog? Hugh Hewitt's answer

Hugh Hewitt's recently released book, Blog: Understanding the Information Reformation That's Changing Your World, follows the recent history of how blogs changed the course of the 2004 US Presidential election. He also considers the current information reformation as analogous to the Protestant Reformation that followed the introduction of Gutenberg's press. The analogy, according to Hewitt, is about the rapid, qualitative shift in the speed of information dissemination (p. 55):

But Luther was living in a new day. Almost immediately after they were posted [Luther's 95 theses, which were originally written in Latin], someone, no one knows exactly who, got hold of a copy of Luther's theses, translated the Latin into German, and published them. Thanks to Gutenberg, Luther--and more important, his ideas--were known all over Germany within two weeks, and all over Europe in a month.

The book has useful chapters on the nuts and bolts of blogging, blogging as an organizational tool, and the kinds of blogs Hugh would like to see started himself. His list of do's and don'ts is helpful:
The key rules of blogging success and significance are these:
  • Post often.
  • Link freely.
  • Be generous in praise and attribution.
  • Don't be long-winded too often, if at all. Brevity is the soul of blogging when you are getting started.
  • Paragraphs are your friend.
  • Profanity loses audiences.
  • Avoid feuds and flame wars.
  • At least at the start, skip the comments sections. You end up with the problem of nuts if you are any good.
  • Keep the title short and easy to remember so that it is easy to recall and type into the space at the top of the page.
In the main, this sounds like good advice to me.

 

TDD, GP & testing in general

There is a post at javarunner linking TDD or test-driven development and genetic programming. Of course, the idea of using GAs in test generation goes back a ways, but the post seems to be pushing this in a more philosophical direction.

 

Open source search engine for Java GAs

The search engine koders.com specializes in searching for open source software. Try the search for genetic algorithms in Java here. Hat tip livejournal.

 

GAs as a labor of love

ForwardMarkets has an interesting post about personal motivation and AI (in particular, GAs). AI and GAs were a labor of love in his case, but they are for most of us. GAs were/are CP-cool and CP-easy (cocktail party cool to talk about and fairly easy to convey the main ideas), but they are engaging enough to get totally lost in. You like analysis, have we got analysis to do. You like design, have we got creative ways to fiddle with mechanism. You like applications, the spectrum of human endeavor is your oyster. And if you have a contrarian streak, you get to rebel against the received CS/AI wisdom and be on the edge, although (hopefully?) some of that is going away (see SIGEVO: From Outlaws to Mainstream post here).

Monday, February 14, 2005

 

Fourteen days to D-day

The DISCUS experiment is getting closer and closer. As mentioned in a previous post, the purpose of this survey is to understand how and why consumers use new media and means of communication, such as the Internet, email, and cell phones, in daily life. Through this study, we hope to find more efficient and comfortable ways and media of communication for the near future. This marketing research is being conducted by researchers from Hakuhodo Inc., the Illinois Genetic Algorithms Lab, and the Chance Discovery Consortium using cutting-edge innovation and creativity support tools for computer-mediated communication developed as part of the DISCUS project.

In the first four days we have already collected 20 questionnaires. Our goal is to reach 200 questionnaires by February 23 and invite 40 participants to join the focus groups. If you are a UIUC student or affiliate and want to participate, just send an email to discus@illigal.ge.uiuc.edu. Surveys can be filled out until February 23, and selected candidates will be asked to participate in focus group activities during the week of February 28 to March 4.

 

Ying-ping Chen directs NCTU Natural Computing Lab

IlliGAL alum and IB (IlliGAL Blogging) blogger Ying-ping Chen is directing the Natural Computing Laboratory at National Chiao Tung University in Hsinchu City, Taiwan. Y-P is known for his work on analysis and extensions of the linkage-learning GA. He also has extensive practical experience using GAs in the real world.

 

The abstract of my PhD thesis

Hello,

I am posting the abstract of my recent PhD thesis, which was greatly improved by my visit to IlliGAL last spring.

Title: Pittsburgh genetics-based machine learning in the data mining era: representations, generalization, and run-time

Abstract:


Pittsburgh genetics-based machine learning (DeJong, Spears, & Gordon, 1993) is, among others (Wilson, 1995; Venturini, 1993), an application of evolutionary computation techniques (Holland, 1975; Goldberg, 1989a) to machine learning tasks. The systems belonging to this approach are characterized by evolving individuals that are complete, usually variable-length, rule sets. Therefore, the solution proposed by this kind of system is the best individual of the population.

When using this approach, we have to deal with several problematic issues, such as controlling the size of the individuals in the population, applying the correct degree of generalization pressure across a broad range of datasets, reducing the considerable run-time of the system, and being able to solve datasets with diverse kinds of attributes. All these issues become even more critical when applied to modern-day data mining problems.

In this thesis we have the general objective of adapting the Pittsburgh model to handle these kinds of datasets successfully. This general objective is split into three parts: (1) improving the generalization capacity of the model, (2) reducing the run-time of the system, and (3) proposing representations for real-valued attributes. These three objectives have been achieved by a combination of four types of proposals:

- Explicit and static default rules
- Windowing techniques for generalization and run-time reduction
- Bloat control and explicit generalization pressure techniques
- The Adaptive Discretization Intervals rule representation for real-valued attributes

Some of these proposals focus on only a single objective, while others partially address more than one objective at the same time. All these proposals are integrated in a system called GAssist (Genetic clASSIfier sySTem).

An experimentation process covering a wide range of data mining problems chosen by many different criteria has been performed. The experiments reported in the thesis are split into two parts. The first part studies several alternatives integrated in the GAssist framework for each kind of proposal. The analysis of these results leads us to propose a small number of global configurations of the system, which are compared in the second part of the experimentation to a wide range of learning systems, showing that the system has competent performance and generates very compact and interpretable solutions.



As one of the topics of my research is the use of default rules, I am very interested in Rob Smith's work on this topic.

 

Separating fitness calculations from algorithms

Some time ago I needed to interface the algorithm I was working with (NSGA-II) to Matlab in order to perform some fitness calculations. While looking for an easy way to do this, I stumbled upon a small "server" program that allowed me to use files for inputting and outputting my data from Matlab. This inspired me to modify my algorithm so that it can send individuals to external programs and read the calculated fitness values once they appear. The principle is very simple: by using one input file and one output file, it becomes possible to implement a "handshaking" signal. The algorithm signals that the file containing the individual is ready by deleting the file containing the previously calculated fitness values. Correspondingly, the fitness calculation program signals that the calculated fitness is ready by deleting the file containing the individual. This method has allowed me to change the way fitness calculations are performed without modifying or recompiling my algorithm. I can thus use Matlab to calculate fitness functions for some problems while using a C++ program that performs database queries to calculate others. Separating the algorithm from the code that calculates the fitness makes the algorithm far more versatile.
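The handshake can be sketched in a few lines. The following toy version is invented for illustration (the file names, the x-squared fitness, and the thread standing in for the external evaluator are all assumptions, not the author's actual setup); it uses write-then-rename so neither side ever reads a half-written file:

```python
import os
import tempfile
import threading
import time

workdir = tempfile.mkdtemp()
ind_file = os.path.join(workdir, "individual.txt")
fit_file = os.path.join(workdir, "fitness.txt")

def write_atomic(path, text):
    """Write via a temp name and rename, so the polling side never
    observes a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)

def evaluator(n_evals):
    """Stand-in for the external program: wait for an individual,
    consume (delete) its file, then publish the fitness."""
    for _ in range(n_evals):
        while not os.path.exists(ind_file):
            time.sleep(0.005)
        with open(ind_file) as f:
            x = float(f.read())
        os.remove(ind_file)                  # signal: individual consumed
        write_atomic(fit_file, str(x * x))   # toy fitness f(x) = x^2

def ask_fitness(x):
    """GA side: delete the old fitness file to signal a new request,
    write the individual, then wait for the fresh fitness."""
    if os.path.exists(fit_file):
        os.remove(fit_file)
    write_atomic(ind_file, str(x))
    while not os.path.exists(fit_file):
        time.sleep(0.005)
    with open(fit_file) as f:
        return float(f.read())

t = threading.Thread(target=evaluator, args=(3,))
t.start()
results = [ask_fitness(x) for x in (2.0, 3.0, 5.0)]
t.join()
print(results)  # [4.0, 9.0, 25.0]
```

The evaluator thread could just as well be Matlab or a C++ binary watching the same directory; the files themselves are the whole protocol.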

Sunday, February 13, 2005

 

GAs dissed at remarkably unreactive

Blogger remarkably unreactive spends much of a remarkably uninformed post dissing GAs (and the engineers who make and use them). If only he knew about the many cool things GAs and GP are doing in his own field. I suppose I could post a set of interesting links to chemistry and material science applications of GAs and GP. Nah. Let him find his own way into that growing literature.

 

Ian Clarke is playing with GAs

here.

 

Evolutionary art featured at The Big Picture

Vik Rubenfeld's The Big Picture has a nice post on evolutionary art.

 

Rob Smith posting

I'm glad to see Rob Smith and his maiden posts on IlliGAL Blogging. Rob did the first post-renaissance thesis on dynamic environments and GAs in 1987, and his PhD dissertation did a lovely job of trying to advance Holland's notion of a default hierarchy by creating an adaptive scheme to properly place rules in an effective hierarchy without prior assumptions about specificity or the like (Alert, Alert, Alert: This work should be revisited by GBML types. It has been wrongly overlooked, mainly because it was about a decade ahead of its time). Since then, Rob has held a variety of positions in the US and the UK. He currently is at the University of the West of England (UWE), and his CV and publications list are available here. One of his most often cited works is his application of learning classifier systems to learning high-performance fighter jet maneuvers.

 

Chips that thrive on uncertainty

It's interesting to note that uncertainty may become a part of future chip designs, in order to increase speed and reduce power consumption, as reported in the BusinessWeek article below (via /.). Could this herald a new era for algorithms that have been using stochasticity for decades?

Chips That Thrive on Uncertainty

Saturday, February 12, 2005

 

Epistasis Blog is blogging about..

epistasis here. Would someone please connect the biological concern with epistasis to the GA concern with problem difficulty here.

 

Breeding Formula One cars with genetic algorithms

Krzysztof Wloch and Peter Bentley at University College London have used genetic algorithms to optimize the performance of Formula One race cars. Here are some excerpts from an article in Wired News:
Formula One teams pride themselves on their mechanical tweaking skills. But the Digital Biology Interest Group at University College London discovered that they can boost performance by using computers to "breed" the cars.

But there was no dating, no wooing, not even a messy oil wet spot in this survival-of-the-fastest experiment. The breeding was done solely with computer-generated simulations using genetic algorithms—programs that combine Mother Nature's laws and computer science to mimic the natural process of evolution. ....

For the race-car research project, likely car designs were generated and then tested using a racing simulation designed by Electronic Arts, with virtual replicas of various Formula One racetracks. The researchers configured 68 parameters in the simulation car, which affected suspension, engine performance, tire and brake pressure, fuel consumption and steering control. ...

Bentley said some of the cars that evolved were "clearly on the limits of drivability—only the computer or Michael Schumacher could have driven a car set up in some of the solutions." According to simulations run with the best and most drivable cars, on various tracks, it is possible to shave 88/100th of a second per lap using genetic algorithms to tune the cars. In an industry where 1/100th of a second really matters, that's significant.
The above story has also been picked up by New Scientist, Science Blog, Scenta, and Innovations Report. A technical paper related to the topic is available here.
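For readers curious what such a setup-tuning GA might look like in miniature, here is a hypothetical sketch: the four-parameter quadratic "lap time" surrogate below is invented to stand in for the 68-parameter commercial racing simulation, which is not publicly available.

```python
import random

rng = random.Random(0)
TARGET = [0.3, 0.7, 0.5, 0.9]   # hypothetical ideal setup, unknown to the GA

def lap_time(setup):
    """Surrogate objective: lower is better, minimized at TARGET."""
    return 60.0 + sum((s - t) ** 2 for s, t in zip(setup, TARGET))

def evolve(pop_size=30, gens=60, sigma=0.05):
    # each individual is a vector of normalized car-setup parameters
    pop = [[rng.random() for _ in range(4)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lap_time)
        elite = pop[: pop_size // 2]                 # truncation selection
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]      # uniform crossover
            child = [min(1.0, max(0.0, g + rng.gauss(0, sigma)))  # Gaussian mutation
                     for g in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=lap_time)

best = evolve()
print([round(g, 2) for g in best], round(lap_time(best), 4))
```

In the real project each `lap_time` call is a full simulated lap, which is exactly why shaving evaluations matters so much.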

 

Bird brains

I think this may be apropos of some of the recent discussion on chance discovery. Moreover, it's from the really wonderful Science Blog, which I think many IlliGAL readers would enjoy.

Bird Brains Show How Trial and Error May Contribute to Learning Science Blog: "Bird Brains"

 

Co-evolutionary wiki up and running

Paul Wiegand noticed IlliGAL Blogging and asked me to post an announcement for the Co-Evolutionary wiki that is up and running, so here it is:

Dear colleagues,

At GECCO-04, an informal coevolution discussion was held where many of the terms used in coevolution were discussed. As a result of this, and after substantial further discussion via email, we have set up a Coevolution Wiki which offers descriptions of several main terms. Naturally, all of these descriptions are subject to debate, and the wiki therefore features a dedicated discussion page for each term. The Coevolution Wiki can be visited here:

http://www2.demo.cs.brandeis.edu/cgi-bin/coec-wiki

Contributions to this online discussion, or any other comments, are very welcome!

Signed, Anthony Bucci, Edwin de Jong, Anthony Liekens, Paul Wiegand


I took a look, and the coec-wiki has made a nice start. Go peek yourself.

Blogging and wikis are important content management tools that should receive greater use in the community for information dissemination and exchange. I applaud this and other efforts in the same vein. We may be a little behind the wiki curve in my lab, but we are now using one for internal lab information exchange online. Some of the free tools are quite sophisticated and good.

 

GA efficiency in 4-part harmony

In a recent post I discussed the term competence in connection with GAs. The term refers to well-designed GAs that solve a large class of hard problems quickly, reliably, and accurately. The bottom line of designing competent GAs for the class of problems Herb Simon called nearly decomposable is that you end up with solvers that might be as good as subquadratic; that is, they give good solutions in times that grow no more quickly than a quadratic function of the number of decision variables (in a relaxed PAC sense).

Polynomial time is usually good news and cause for celebration, but if you are solving big ole problems, the square of a large number is darn huge (yes, Virginia, these are technical terms I'm using here). The square of a thousand is a million, which means that you might have to run millions of function evaluations on large problems, which may be impractical if your eval is an expensive simulation, computation, or data-crunching task.

Thus, we see a problem with stopping at seeking GAs that are merely competent. Yes, competence takes us from solutions that are worst-case exponential to polynomial (from intractable to tractable), but to go from tractable to practical requires that we pay attention to enhancing the efficiency of effective (competent) GAs.

At IlliGAL we do this with a four-part decomposition of the efficiency problem:
  1. Parallelization
  2. Time continuation
  3. Evaluation relaxation
  4. Hybridization

Erick Cantu-Paz's work was our first principled foray into the issue of parallelization. My paper on time continuation set the stage for Ravi Srivastava's thesis and Kumara Sastry's work with me on BB-mutation. Kumara's MS thesis and Laura Albert's thesis set the stage for our current thinking in evaluation relaxation. Work with Siegfried Voessner in 1999 laid down a bare-bones framework for thinking about global-local hybrids and was followed up by Abhishek Sinha's MS thesis. Many of these items are cited in the references of The Design of Innovation (Google-searchable version here).


 

GA-created music

As long as I am blogging about stuff (like art) that was once believed to be beyond the reach of computation, I should mention evolutionary music. Al Biles, a frequent contributor to and attendee of the GECCO conference, has a nice summary of his work here, along with a bibliography of evolutionary music more generally here.

 

Genetic and evolutionary art

A stock phrase in some of my writing has gone something like the following: "Genetic algorithms are used across the spectrum of human endeavor," and some have criticized that sentence as overstated, but compared to other computational procedures, GAs have found their way into nooks and crannies of human thought that have previously resisted computational aid. One area where this is particularly true is in the realm of visual art. Since the early 90s, a fair number of people have used genetic algorithms to evolve art. A nice survey site is here. A page of Steven Rooke's work is here, and a site citing Latham's work is here. Whether you find the art products pleasing to your eye is not really the issue. That a computer program was an active participant in the act of creation is.

Friday, February 11, 2005

 

Announcing my Ph.D. thesis

Hello.
I am posting an abstract of my dissertation. Even though the thesis itself is not available online, some related papers can be obtained.

1. A Genetic Algorithm for Shortest Path Routing Problem and the Sizing of Populations
2. Elitism-based Compact Genetic Algorithms
3. Real-coded Bayesian Optimization Algorithm: Bringing the Strength of BOA into the Continuous World

Thank you.


Title: Theory, Design, and Application of Efficient Genetic and Evolutionary Algorithms

Abstract:
The goal of this dissertation is to develop efficient optimization algorithms to solve diverse real-world problems of graded difficulty. Genetic and evolutionary mechanisms have been deployed for reaching the goal.

This thesis has made five significant contributions.

Practical guidelines for developing genetic algorithms (GAs) to solve real-world problems have been proposed. This fills a long-standing gap between the theory and practice of GAs. A practical population-sizing model for computing solutions with desired quality has also been developed. The model needs no statistical information about the problems and has been duly validated by computer simulation experiments.

The suggested design-guidelines have been followed in developing a GA for solving the shortest path (SP) routing problem. Experimental studies validate the effectiveness of the guidelines. Further, the population-sizing model passes the feasibility test for this application. It appears to be applicable to a wide class of problems.

Elitist compact genetic algorithms (cGAs) have been developed under the framework of simple estimation of distribution algorithms (EDAs). They can deal with memory- and time-constrained problems. In addition, they do not require any prior knowledge about the problems. The design approach enables a typical cGA to overcome selection noise. This is achieved by persisting with the current best solution until, hopefully, a better solution is found. A higher quality of solutions and a higher rate of convergence are attained in this way for most of the test problems. The hidden connection between EDAs and evolution strategies (ESs) has been made explicit. An analytical justification of this relationship is followed by its empirical verification. Further, a speedup model that quantifies convergence improvement has also been developed. Experimental evidence has been supplied to support the claims.

The real-coded Bayesian optimization algorithm (rBOA) has been proposed under the general framework of advanced EDAs. Many difficult problems -- especially those that can be decomposed into subproblems of bounded difficulty -- can be solved quickly, accurately, and reliably with rBOA. It can automatically discover unsuspected problem regularities and effectively exploit this knowledge to perform robust and scalable search. This is achieved by constructing the Bayesian factorization graph using finite mixture models. All the relevant substructures are extracted from the graph. Independent fitting of each substructure by mixture distributions is then followed by drawing new solutions by independent subproblem-wise sampling. An analytical model of rBOA scalability in the context of problems of bounded difficulty has also been investigated. The criterion that has been adopted for the purpose is the number of fitness function evaluations until convergence to the optimum. It has been shown that the rBOA finds the optimal solution with a sub-quadratic scale-up behavior with regard to the size of the problem. Empirical support for the conclusion has also been provided. Further, the rBOA is found to be comparable (or even better) to other advanced EDAs when faced with nondecomposable problems.

Finally, a competent multiobjective EDA (MEDA) has also been developed by extending the (single-objective) rBOA. The multiobjective rBOA (MrBOA) is able to automatically discover and effectively exploit implicit regularities in multiobjective optimization problems (MOPs). A selection method has been proposed for preserving diversity. This is done by assigning fitness to individuals by domination rank with some penalty imposed on sharing and crowding of individuals. It must be noted that the solution quality is not compromised in the process. It is experimentally demonstrated that MrBOA outperforms other state-of-the-art multiobjective GEAs (MGEAs) for decomposable as well as nondecomposable MOPs.

It is felt that this work will have a major impact on future genetic and evolutionary computation (GEC) research. Our ardent hope is that it will play a decisive role in bringing about a paradigm shift in computational optimization research.
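The elitist compact GA sketched in the abstract is easy to illustrate in miniature. The following toy version is my own sketch, not the thesis's code (OneMax and all the constants are stand-ins): a compact GA keeps only a probability vector, and the elitist variant persists with the best-so-far individual as one competitor until it is beaten, which damps selection noise.

```python
import random

rng = random.Random(1)
L, N = 20, 50            # chromosome length, virtual population size

def onemax(x):
    return sum(x)        # stand-in fitness: count the ones

def sample(p):
    """Draw one individual from the probability vector."""
    return [1 if rng.random() < pi else 0 for pi in p]

p = [0.5] * L            # the compact GA's entire 'population'
elite = sample(p)
for _ in range(2000):
    challenger = sample(p)
    if onemax(elite) >= onemax(challenger):
        winner, loser = elite, challenger
    else:
        winner, loser = challenger, elite
    for i in range(L):   # shift p toward the winner by 1/N per differing gene
        if winner[i] != loser[i]:
            p[i] += (1.0 / N) if winner[i] == 1 else -(1.0 / N)
            p[i] = min(1.0, max(0.0, p[i]))
    elite = winner       # persistent elitism: keep the best-so-far competitor
print(onemax(elite))
```

Without the `elite = winner` line, both competitors would be freshly sampled each round, and noisy comparisons could walk the probability vector in the wrong direction; persistence is the whole trick.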

 

Biota.org, darwin@home & the Alive Prize

Picked up on biota.org from ruminator's ilk, which led me to Darwin at Home which has a list of advisors. Here is the project mission statement:

It is our hypothesis that compute space is now or soon will be sufficiently rich and complex to support a reasonable "lifelike" simulation of the processes and products of evolution.

The Darwin@Home project is a challenge to multiple, independent teams to construct platforms in software, hardware or a combination, to test this hypothesis. In recent years, several platforms have been built that suggest that this goal is attainable. We believe that by pooling efforts and creating a shared community of interest, we will quicken the journey along the path of innovation.

The project has been picked up in NewScientist.com and eventually the project would like to offer a prize similar to the X prize:
A long term goal of Biota.org has been to create an international prize competition called the AlivePrize. Darwin@Home is a first step along that road by encouraging the community of people developing platforms and providing them resources and intellectual contributions. In a couple of years after the Darwin@Home efforts have matured, we will pursue the goal of financing and managing a competitive prize modeled after the Ansari X-Prize and the DARPA Grand Challenge.

I wonder whether this sort of thing is helpful to the field or merely shameless self-promotion and grandstanding. But perhaps some of you are wondering the same thing about this blog.

 

Is your GA competent?

At IlliGAL, we use the term competence to refer to principled procedures that solve a large category of hard problems quickly, reliably, and accurately. By principled, we mean that there exists theory that gives us reason to believe that the procedure will work and scale well over the whole class, but we stop short of requiring a convergence proof (see my burning GAs and toaster convergence rant) and accept dimensional arguments bolstered by facetwise or other simplified models. The quick, reliable, and accurate part is quite similar to the usual requirements of PAC learning: we want polynomial-time performance that is probably approximately correct.

For hard problems we look for boundedly difficult problems along dimensions of difficulty that challenge our procedure. This is a game-theoretic argument: if we understand the dimensions of problem difficulty, and we can solve problems on some far-out boundaries of those dimensions in a sufficiently quick, reliable, and accurate fashion, we should be able to solve easier problems as fast as or faster than those on the edge of our design envelope. The end result is procedures that scale nicely (subquadratically as problem size grows) and whose parameters can be set in a principled manner to give reliable and accurate solutions.

 

IlliGAL Blogging rolls expand

IlliGAL Blogging welcomes more alums and affiliates to our roll of bloggers. Yukio Ohsawa (yukio) is a faculty member at Tsukuba University just outside of Tokyo and spearheads research in chance discovery. Naohiro Matsumura (nao) is on the staff of Osaka University and is pursuing very interesting research using the influence diffusion method (IDM). Long-time GPer Una-May O'Reilly works in Rod Brooks's robotics lab at MIT and is the current chair of the GECCO conference. Welcome, welcome. Now post, post.

 

Don't code, evolve

Yesterday I was chatting with my friends, and I mentioned genetic programming (GP). Interestingly, they thought this kind of thing only happens in sci-fi movies. "Let programs write programs? Are you serious?" they said. No, my friends, I was not kidding. It's there and it has been developed for years. Professor John Koza, who invented the term and has devoted himself to this area, has published a nice series of books on GP. Anyone who is interested in GP should definitely read these books.

The basic idea of GP is that, instead of evolving solutions to optimization problems directly, the objects being evolved are now programs (typically encoded in tree structures, as in LISP). In our lab, Kumara Sastry applies GP to optimization problems in materials science. Along the way, he also contributes to GP theory. One of his goals is to develop a competent GP that respects the linkage in programs. As you can imagine, one of the main challenges here is to properly define linkage and building blocks in GP.
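
For readers who have never seen GP in action, here is a minimal, self-contained sketch of tree-based GP on a toy symbolic-regression task (fit f(x) = x² + x). This is my own illustrative simplification, not Koza's or Kumara's actual setup; the function set, population size, and generation count are arbitrary choices.

```python
import copy
import math
import operator
import random

random.seed(0)

# Function set (with arities) and terminal set for the expression trees.
FUNCS = [(operator.add, 2), (operator.sub, 2), (operator.mul, 2)]
TERMS = ['x', 1.0]

def random_tree(depth=3):
    """Grow a random expression tree as nested lists: [fn, child, child]."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMS)
    fn, arity = random.choice(FUNCS)
    return [fn] + [random_tree(depth - 1) for _ in range(arity)]

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, float):
        return tree
    fn = tree[0]
    return fn(*(evaluate(child, x) for child in tree[1:]))

def fitness(tree):
    """Sum of squared errors against x*x + x on a few sample points."""
    total = 0.0
    for x in range(-5, 6):
        e = evaluate(tree, x) - (x * x + x)
        total += e * e
    return total if math.isfinite(total) else float('inf')

def paths(tree, path=()):
    """Yield index paths to every subtree (root is the empty path)."""
    yield path
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], 1):
            yield from paths(child, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def crossover(a, b):
    """Subtree crossover: graft a random subtree of b into a copy of a."""
    a = copy.deepcopy(a)
    target = random.choice(list(paths(a)))
    graft = copy.deepcopy(get(b, random.choice(list(paths(b)))))
    if not target:
        return graft
    parent = get(a, target[:-1])
    parent[target[-1]] = graft
    return a

pop = [random_tree() for _ in range(60)]
initial_best = min(fitness(t) for t in pop)
for gen in range(30):
    pop.sort(key=fitness)
    if fitness(pop[0]) == 0:
        break
    survivors = pop[:20]  # truncation selection with elitism
    pop = survivors + [crossover(random.choice(survivors), random.choice(survivors))
                       for _ in range(40)]
best = min(pop, key=fitness)
```

With elitism the best fitness never worsens, and on this toy target the sketch often discovers x·(x+1) exactly within a few dozen generations.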

Can you imagine that someday we will no longer need to code, and computers will do it for us? Yes, there is still a long way to go, but I believe we are not too far from it.

Thursday, February 10, 2005

 

More Georges

More on IlliGAL alum Georges Harik and Gmail from BBC News here.

 

Georges Harik, IlliGAL, and Google

During the mid-90s, Georges Harik visited the Illinois Genetic Algorithms Lab for an extended period as part of his PhD studies at the University of Michigan. He did a lovely thesis on the linkage learning GA, and did a number of other very creative things, including the gambler's-ruin population-sizing model, the compact GA, and the extended compact GA. Georges left the lab to make his mark in commerce, first working for a data-mining group at SGI and then taking a position as one of the early employees at Google in early 1999. Today, Georges is Director of Googlettes at Google. Read and listen to his interview on WebTalkRadio about his role in Gmail.
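
For the curious, the compact GA is simple enough to sketch in a few lines. The version below is my own minimal rendering with illustrative parameters, run on the OneMax problem: a probability vector stands in for a population of pop_size individuals, two samples compete, and the vector shifts toward the winner by 1/pop_size per differing bit.

```python
import random

random.seed(1)

def onemax(bits):
    return sum(bits)

def sample(p):
    """Draw one individual bit-by-bit from the probability vector."""
    return [1 if random.random() < pi else 0 for pi in p]

def compact_ga(n_bits=20, pop_size=50, max_iters=20000):
    """Compact GA sketch: evolve a probability vector instead of a population."""
    p = [0.5] * n_bits
    step = 1.0 / pop_size
    for _ in range(max_iters):
        a, b = sample(p), sample(p)
        winner, loser = (a, b) if onemax(a) >= onemax(b) else (b, a)
        for i in range(n_bits):
            if winner[i] != loser[i]:
                p[i] += step if winner[i] == 1 else -step
                p[i] = min(1.0, max(0.0, p[i]))  # clamp so bits can fully fix
        if all(pi in (0.0, 1.0) for pi in p):
            break  # every bit has fixed; the vector now encodes one solution
    return p

p = compact_ga()
solution = [int(round(pi)) for pi in p]
```

The appeal is memory: one probability vector replaces a whole population, yet (as Harik's analysis shows) the dynamics mimic a simple GA with uniform crossover.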

 

Genetic algorithms for automatic map labeling

I was cleaning my desk a couple of days back and in the process came across Steven van Dijk's PhD thesis "Genetic Algorithms for Map Labeling" (thanks, Steven, for mailing it to me). It's a very good piece of work, and in my opinion it hasn't gotten the attention in the GA community that it deserves. In his thesis, Steven uses facetwise models and design-decomposition principles pioneered by Dave Goldberg to design efficient and scalable GAs for map labeling and other search/optimization problems in geographical information systems (GIS).

Python source code for positioning non-overlapping labels using genetic algorithms is available here.
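
To give a flavor of the problem, here is a toy GA for point-feature label placement. This is my own illustrative formulation, far simpler than the thesis: each point may take one of four candidate label positions, and the GA minimizes the number of overlapping label pairs.

```python
import random

LABEL_W, LABEL_H = 2.0, 1.0
# Candidate offsets of the label's lower-left corner: NE, NW, SE, SW.
OFFSETS = [(0, 0), (-LABEL_W, 0), (0, -LABEL_H), (-LABEL_W, -LABEL_H)]

def label_box(point, choice):
    """Axis-aligned box (x1, y1, x2, y2) for a point's chosen label slot."""
    x, y = point
    dx, dy = OFFSETS[choice]
    return (x + dx, y + dy, x + dx + LABEL_W, y + dy + LABEL_H)

def overlaps(b1, b2):
    return b1[0] < b2[2] and b2[0] < b1[2] and b1[1] < b2[3] and b2[1] < b1[3]

def conflicts(points, genome):
    """Fitness to minimize: number of overlapping label pairs."""
    boxes = [label_box(p, g) for p, g in zip(points, genome)]
    return sum(overlaps(boxes[i], boxes[j])
               for i in range(len(boxes)) for j in range(i + 1, len(boxes)))

def evolve(points, pop_size=40, gens=100):
    n = len(points)
    pop = [[random.randrange(4) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda g: conflicts(points, g))
        elite = pop[:pop_size // 4]          # keep the least-conflicted quarter
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, n)
            child = a[:cut] + b[cut:]        # one-point crossover
            i = random.randrange(n)
            child[i] = random.randrange(4)   # point mutation
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda g: conflicts(points, g))

random.seed(2)
points = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(12)]
best = evolve(points)
```

Real map labeling adds cartographic preferences among the candidate positions and the option of dropping labels; the thesis shows how facetwise analysis guides the design of GAs for that richer problem.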

 

Chance discovery, marketing, and focus groups: DISCUS is coming

The DISCUS project is approaching a big milestone. During last fall's visit to Urbana-Champaign by our research colleagues from Hakuhodo, Tsukuba University, Osaka University, and the Chance Discovery Consortium, we agreed that we needed to involve everyday users in our research endeavor. They are the end users of the DISCUS tools; thus, the sooner we get their feedback the better.

We have been designing a big experiment to validate some of our theoretical assumptions and results on innovation and creativity support, and on influence diffusion in web-centric social networks. From February 28 to March 4, several focus groups will be using the state-of-the-art DISCUS platform. The goal: to discuss media environments under the guidance of our Hakuhodo colleagues. Around forty potential customers, none of them related to DISCUS, will form several focus groups to discuss different media environments and how they use them. Besides the intrinsic marketing interest of such an experiment, this first big stress test will let us observe, in real time and under realistic conditions, how the innovation and creativity support tools of DISCUS perform and what feedback they elicit.

 

New IlliGAL technical reports

IlliGAL is pleased to announce the publication of the following technical reports:

Matsumura, N., Goldberg, D.E., Llorà, X. (2005). Mining Social Networks in Message Boards. IlliGAL Report No. 2005001. (Abstract) (Full paper in PS) (Full paper in PDF).

Lima, C. F., Sastry, K., Goldberg, D. E., Lobo, F. G. (2005). Combining Competent Crossover and Mutation Operators: A Probabilistic Model Building Approach. IlliGAL Report No. 2005002. (Abstract) (Full paper in PS) (Full paper in PDF).

Sastry, K., Abbass, H. A., Goldberg, D. E., Johnson, D. D. (2005). Sub-structural Niching in Estimation of Distribution Algorithms. IlliGAL Report No. 2005003. (Abstract) (Full paper in PS) (Full paper in PDF).

Sastry, K., Pelikan, M., Goldberg, D. E. (2005). Decomposable Problems, Niching, and Scalability of Multiobjective Estimation of Distribution Algorithms. IlliGAL Report No. 2005004. (Abstract) (Full paper in PS) (Full paper in PDF).

Pelikan, M., Sastry, K., Goldberg, D. E. (2005). Multiobjective hBOA, Clustering, and Scalability. IlliGAL Report No. 2005005. (Abstract) (Full paper in PS) (Full paper in PDF)

Other IlliGAL technical reports and publications are available here.

Wednesday, February 09, 2005

 

Genetic algorithms walking back to their source

It goes without saying that genetic algorithms were inspired by biology. This is not new. What is new is a growing effort to explore how evolutionary methods can go back to their source and help solve hard biological problems, closing a thirty-year loop. Evolutionary computation now has several applications in bioinformatics and biochemistry, to mention just two areas. Here are two links to interesting real-world applications and results of using evolution as a design tool. The first shows virtual images of peptides designed using genetic algorithms to block pathogenic protein-protein interactions---I have been involved in this work. The second link is an interesting paper by James Foster on another approach to evolving molecules for drug design.

 

GAs picked up in mainstream blogs

The Technology Review story on GAs is getting picked up in a variety of blogs, including non-geek blogs. The Big Picture by Vik Rubenfeld is the latest to do so.

 

Creative evolutionary design tools

This is a small follow-up on my last post about creativity, genetic algorithms, and architecture. Leandro Madrazo emailed me about the post on evolutionary methods and architecture. The truth is that I really appreciate him spending time on this blog. Again, if you are interested in collaborative human-based genetic algorithms and architecture, you may want to take a closer look at his work. Another must-read on the convergence of evolutionary computation and creativity, again with some applications to architecture, is the work of Una-May O'Reilly. She has been working on very interesting applications of genetic programming and artificial life. You can check it out here.

Tuesday, February 08, 2005

 

Take a chance on chance discovery

I was recently asked to write a foreword for a new volume of papers on chance discovery. In particular, Readings in Chance Discovery, Akinori Abe and Yukio Ohsawa (editors), ISBN 0-9751004-8-3, will soon be published in the International Series on Advanced Intelligence (Series Editors: R. J. Howlett and L. C. Jain). I wrote the following:

I first became familiar with the term "chance discovery" on a trip to Japan in December 2001. I was invited to give a series of lectures on genetic algorithms and engineering leadership at the Graduate School of Systems Management of Tsukuba University by my good and longtime colleague Takao Terano. One of the hosts for the visit was a relatively new faculty member named Yukio Ohsawa, and when I arrived, he filled my ears with the glory of chance discovery. Unfortunately, during that first meeting, Dr. Ohsawa's zeal for chance discovery went in one of my ears and out the other, and if the situation had not changed, this story would have had an uninteresting ending. But Dr. Ohsawa is nothing if not persistent, and he continued to regale me with tales of chance discovery accomplishment, and somehow he got me to come back to Japan and give a tutorial with him merging GAs and CD topics into one program.

At that second meeting, my ears and my mind opened up, and I came to realize the importance of CD as a subject. Simply put, where much work in data mining makes hay from high-probability co-occurrences, CD uses a variety of techniques to elevate and study the unlikely. In so doing, chance discovery focuses on phenomena that may be important in the future, on phenomena that may be an underlying and unrecognized cause, or on important background phenomena that just plain deserve further exploration and explanation. As a result, chance discovery appears central to a better, more mechanistic understanding of creativity, smart mobs (to use Rheingold's term), and, more generally, the unexplained.
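
To make the contrast with frequency-based data mining concrete, here is a toy Python illustration of the intuition only; this is my own simplification, not the actual KeyGraph algorithm: instead of reporting frequent terms, rank rare terms by how many distinct frequent terms they co-occur with.

```python
from collections import Counter

# Toy chance-discovery intuition (NOT the real KeyGraph algorithm):
# rare terms that bridge many frequent terms are flagged as "chances".
sentences = [
    "customers love the camera",
    "customers love the battery",
    "customers love it",
    "camera and battery drain quickly",
    "the battery vendor changed",
    "camera vendor ships late",
]
docs = [s.split() for s in sentences]
freq = Counter(w for d in docs for w in d)
frequent = {w for w, c in freq.items() if c >= 3}

def bridge_score(word):
    """Count the distinct frequent terms this word co-occurs with."""
    partners = {w for d in docs if word in d
                  for w in d if w in frequent and w != word}
    return len(partners)

rare = [w for w, c in freq.items() if c < 3]
chances = sorted(rare, key=bridge_score, reverse=True)
# 'vendor' is rare yet bridges the camera and battery topics.
```

KeyGraph proper goes much further, roughly: it builds a co-occurrence graph, clusters the frequent terms into foundations, and scores candidate terms by how strongly they connect the clusters; the sketch above only captures the "elevate the unlikely" spirit.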

Since that second meeting, I have become a fan and sometimes practitioner of CD and its extensions. My own work on collaborative systems couples genetic algorithms (regular, interactive, and human-based) with KeyGraph chance discovery to help support organizational innovation (see http://www-discus.ge.uiuc.edu/). Chance discovery continues to chug along as a field in Japan, and the current volume's geographical representation shows that CD is becoming (has become?) a scientific topic without borders.

In short, the current volume advances the state of the art in chance discovery, in philosophy, in theory, and in practice. For those who are familiar with chance discovery and its uses, it is an indispensable guide to where CD research is and where it is going. To those who are unfamiliar with the topic, I recommend it as an entry point to an important area. Either way, I urge you to pick up this volume, read it, use it, and don't be like me and take another eight months to pay attention and get involved.

I mean it. Go read something about chance discovery.

Monday, February 07, 2005

 

Catching up on GAs

Thanks to ForwardMarkets for noticing IlliGAL Blogging. The post mentioned the value of this site in catching up on GAs since the publication of my 1989 book, Genetic Algorithms in Search, Optimization, and Machine Learning (GASOML). In 2002, I completed the now Google-searchable The Design of Innovation: Lessons from and for Competent Genetic Algorithms (DoI). The aim of the volume was to fill a void between largely empirical studies of this, that, or the other new operator and purely theoretical studies using lovely mathematics that have very little to tell us about the design of GAs that really work. The book's approach of building fundamental facetwise and patchquilt models has since been extended to understanding multiobjective GAs, classifier systems, and organizational size. Such usable theory may answer exactly the kinds of questions IlliGAL Blogging readers have been accumulating since the publication of GASOML.
