Randomly generating source code

Thursday, August 21, 2008


So this idea has been circulating in my mind for at least a few weeks now: if you wrote code that spat out source code at random, how long would it take to come up with something that actually works and, more importantly, what would it do? This is, of course, derived from the Infinite Monkey Theorem, which states:

a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare.
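
To get a rough feel for the "how long would it take" part, here is a small back-of-the-envelope sketch in Python. It assumes every keystroke is drawn uniformly and independently from a fixed set of keys (95 printable ASCII characters is my assumption, not a measured figure) and that each attempt is a fresh string of the same length as the target. Under those assumptions the expected number of attempts to hit one specific string is the alphabet size raised to the string's length. The function name and the target string are made up for illustration.

```python
def expected_attempts(alphabet_size: int, length: int) -> int:
    """Expected number of independent tries before one specific
    string of the given length is typed by uniform random keystrokes."""
    return alphabet_size ** length


# Hypothetical target: a tiny but complete PHP script.
target = "<?php echo 1;"
keys = 95  # assumed number of printable ASCII characters

attempts = expected_attempts(keys, len(target))
print(f"{len(target)} characters: roughly {attempts:.2e} attempts on average")
```

Even a one-line script is hopelessly far away at the character level, which is one reason to work with whole functions and tokens instead of single keystrokes.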

I've been doing some preliminary work on the logistics of actually generating random code. First, of course, I need a way to collect and hold a large amount of data, namely some or all of the functions in the target language, in this case PHP. A daunting and likely resource-intensive task lies in front of me, starting with building an array of every function in the PHP documentation.

From there I have a question to answer about the direction of this endeavor: how random is random enough? To put a finer point on it, I have to decide whether storing the types of arguments each function requires would remove too much of the hit-or-miss feel from the project by eliminating a decent portion of the potential errors. Passing only strings to a function that needs, say, an array, maybe a Boolean and perhaps a handful of integers as well would no doubt send the interpreter screaming into the void of error. Since I'm doing this purely for fun, I don't have to satisfy some outside force's rules and regulations, so naturally I'm leaning towards keeping the specifics of what arguments each function needs. The arguments themselves can still contain random data of the proper type, so things remain a crapshoot.
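
A minimal sketch of that type-aware approach, written in Python for brevity and emitting PHP source text. The four-entry catalog is hand-written and purely illustrative (a real run would build it from the PHP documentation), and the helper names `random_value` and `random_call` are my own inventions. Each call gets arguments of the right type, but the values themselves are random.

```python
import random

# Hypothetical hand-written catalog: PHP function name -> argument types.
# A real catalog would be built from the PHP documentation.
SIGNATURES = {
    "strlen": ["string"],
    "str_repeat": ["string", "int"],
    "array_sum": ["array"],
    "in_array": ["int", "array", "bool"],
}


def random_value(kind, rng):
    """Return PHP source text for a random literal of the requested type."""
    if kind == "int":
        return str(rng.randint(0, 99))
    if kind == "bool":
        return rng.choice(["true", "false"])
    if kind == "string":
        letters = "abcdefghijklmnopqrstuvwxyz"
        body = "".join(rng.choice(letters) for _ in range(rng.randint(0, 8)))
        return "'" + body + "'"
    if kind == "array":
        items = ", ".join(str(rng.randint(0, 99)) for _ in range(rng.randint(0, 5)))
        return "array(" + items + ")"
    raise ValueError(f"unknown type: {kind}")


def random_call(rng):
    """Pick a random function and fill in type-correct random arguments."""
    name = rng.choice(sorted(SIGNATURES))
    args = ", ".join(random_value(kind, rng) for kind in SIGNATURES[name])
    return f"{name}({args});"


rng = random.Random()
print("<?php")
for _ in range(3):
    print(random_call(rng))
```

Dropping the type information would amount to choosing `kind` at random inside `random_call`, which is the fully hit-or-miss variant.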

All the technical talk aside, the concept of randomly generating source code interests me because, unlike brute forcing a password, the outcome is mostly unknown. With passwords you keep throwing stuff at the problem until you find the one right combination; with Infinite Monkey Development (if I may be so bold as to mutilate the phrase) you are throwing code together until something sticks in an at least semi-coherent fashion. In all likelihood the output will almost always be total junk that doesn't even parse correctly, let alone run and do something I could comprehend. That is not the point, however. The idea is to keep running with it until something is born out of the primordial soup of segfaults, parser errors and other miscellaneous screw-ups. Whatever is born will no doubt be gimped at best, but it is something that can be worked with, studied, refined and ultimately exploited, hopefully for the profit of all mankind, or at least myself!
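
The keep-throwing-until-something-sticks loop could be sketched roughly like this, again in Python. Everything here is an assumption for illustration: the tiny token vocabulary, the function names, and especially `looks_plausible`, which is only a stand-in that checks for balanced parentheses and a trailing semicolon. It is not a PHP parser. In a real run the check would hand each candidate to the PHP interpreter and see whether it parses and runs.

```python
import random

# Hypothetical vocabulary; a real run would draw on the full function catalog.
TOKENS = ["strlen", "(", ")", "'abc'", ";", "echo", "1", "+"]


def random_candidate(rng, max_tokens=6):
    """Glue together a random number of random tokens."""
    count = rng.randint(1, max_tokens)
    return " ".join(rng.choice(TOKENS) for _ in range(count))


def looks_plausible(code):
    """Stand-in check only: balanced parentheses and a trailing semicolon.
    The real test would be the PHP interpreter itself."""
    depth = 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and code.endswith(";")


def search(rng, check, max_tries=10_000):
    """Generate candidates until one passes the check or the budget runs out.
    Returns (attempt_number, candidate) or None."""
    for attempt in range(1, max_tries + 1):
        candidate = random_candidate(rng)
        if check(candidate):
            return attempt, candidate
    return None


print(search(random.Random(), looks_plausible))
```

Because the check is passed in as a parameter, the stand-in can be swapped for a stricter one without touching the loop.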



posted by dword at 4:51 PM


Comments for Randomly generating source code