How Jazillian Translates C To Java Code
Jazillian takes the source files
that you provide and applies a series of transformation rules
to convert various C constructs and patterns into their Java
equivalents, generating “natural” code. For more information,
see How Jazillian Translates Legacy Code to Java Code.
Example 1: "Hello, World"
The classic C program:
int main(int argc, char *argv[]) {
    printf("Hello, world!\n");
}
Becomes the classic Java program:
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}
In this case, Jazillian applied the following "rules":
- Assuming the file was called "hello.c", the Java file "Hello.java" was created.
- The signature of the main() function was changed.
- The printf() call was changed to a System.out.println() call.
- The function was enclosed inside a new "Hello" class.
The key to Jazillian is that
it produces not just correct, but reasonable Java code.
Each of the "rules" applied here illustrates this.
Jazillian makes reasonable assumptions about what programmers do.
You can't tell from this example, but Jazillian
translates each and every C main() function into the
"real" Java main(). This is not technically always
correct (for example, you may have a function that you happened to call
"main" that takes just a single "int" argument).
However, in the real world, a "main()" function is almost always
meant to be the function that's invoked on running the application.
Jazillian takes steps to generate realistic code.
You may have noticed that the "printf rule" did not produce:
System.out.print("Hello, world!\n");
That would have been perfectly valid,
working Java code. However, it is not "real" Java code.
A real Java programmer would not have embedded the (platform-specific)
newline in the text, and instead would write:
System.out.println("Hello, world!");
So the "printf rule" was smart enough
to see that the last argument ended with a newline character,
and so it should remove the newline character and use "println"
instead of "print".
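To make the idea concrete, here is a minimal sketch of what such a "printf rule" could look like. This is not Jazillian's actual implementation: the class name PrintfRule and its methods are hypothetical, and the sketch only handles a printf() call whose single argument is a plain string literal with no format specifiers.

```java
// Hypothetical sketch of a "printf rule"; not Jazillian's actual code.
final class PrintfRule {

    // True if the literal's body ends with the two-character escape \n
    // (an odd number of backslashes directly before the final 'n').
    static boolean endsWithNewlineEscape(String body) {
        if (!body.endsWith("n")) {
            return false;
        }
        int backslashes = 0;
        for (int i = body.length() - 2; i >= 0 && body.charAt(i) == '\\'; i--) {
            backslashes++;
        }
        return backslashes % 2 == 1;
    }

    // cLiteral is the string literal exactly as it appears in the C source,
    // including the surrounding double quotes.
    static String translate(String cLiteral) {
        String body = cLiteral.substring(1, cLiteral.length() - 1);
        if (endsWithNewlineEscape(body)) {
            // Drop the trailing \n and let println supply the line break.
            String trimmed = body.substring(0, body.length() - 2);
            return "System.out.println(\"" + trimmed + "\");";
        }
        return "System.out.print(\"" + body + "\");";
    }
}
```

A real rule would also have to deal with format specifiers such as %d, which this sketch ignores.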
Jazillian makes reasonable guesses.
How did Jazillian know
to call the class "Hello"? It took the name of the file
that contained the text, in this case "hello.c", and converted
it to the Java class naming convention ("Hello"), because
each Java file name must match the public class that it contains.
"Hello" may not be the perfect name for this class,
but it's as good a guess as a machine can make.
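The file-name guess can be sketched in a few lines of Java. The helper below is hypothetical, not taken from Jazillian: it strips the directory and extension and capitalizes the first letter. Treating "_" and "-" as word separators is my own assumption about what a reasonable naming rule would do.

```java
// Hypothetical sketch of deriving a Java class name from a C file name.
final class ClassNames {

    static String classNameFor(String cFileName) {
        // Drop any leading directory, whichever separator is used.
        int slash = Math.max(cFileName.lastIndexOf('/'), cFileName.lastIndexOf('\\'));
        String base = cFileName.substring(slash + 1);

        // Drop the extension (".c", ".h", ...), if there is one.
        int dot = base.lastIndexOf('.');
        if (dot > 0) {
            base = base.substring(0, dot);
        }

        // Capitalize the first character of each word; separators such as
        // '_' or '-' start a new word and are not copied to the result.
        StringBuilder name = new StringBuilder();
        boolean upperNext = true;
        for (char c : base.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                name.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            } else {
                upperNext = true;
            }
        }
        return name.toString();
    }
}
```

With this sketch, "hello.c" yields "Hello" and "src/string_utils.c" yields "StringUtils". A file name that starts with a digit would still produce an invalid Java identifier, so a real rule would need a fallback for that case.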
So even with the simple classic 3-line "Hello
world" program, each of the rules that were applied took
various "real-world programming" heuristics into account
to produce "real-world" Java code. It was not enough
to produce correct, working code. In fact, Jazillian will not always
produce correct, working code (for example, you may have multiple main()
functions, or two "hello.c" files). But the
code looks like it was written by a good Java programmer,
not by a machine. This philosophy of "use heuristics to
produce realistic code" is inherent in the design of Jazillian
and most of its rules.
Example 2: strcpy() vs. String Assignment
The "Hello, World" example is pretty trivial,
so here are a couple of better examples. The first example
has to do with Strings. In C, programmers tend to copy strings
a lot with strcpy(), while Java programmers very rarely
make a copy of a String. That's because Java String objects
are immutable: they can't be changed. So most Java programs tend
to have many different String variables all pointing to the
same memory location, because none of them can change the
characters that are stored there.
So one of the Jazillian rules translates
a strcpy() call:
char *hello = "hello";
char greeting[6];
strcpy(greeting, hello);
...to just a Java assignment, which causes
two String variables to point to the same place:
String hello = "hello";
String greeting = hello;
Obviously, having two locations in memory,
each with the characters "hello" in them, is different than
having two pointers pointing to a single memory location containing
the characters "hello". I won't go into the details, but you
can surely see how the behavior of the Java code might differ
from the C code. But the good news is that the behavior may
not differ. In fact, Java programmers are often pleasantly surprised
at how rarely they really need to make multiple copies of Strings.
I dare say, it's almost never. And there's more good news: some
other rule has to do something when it encounters
C code like:
greeting[4] = '\0';
In fact, in this case, some other rule
will set 'greeting' to some new String value, indeed without
changing the value of the 'hello' variable. But your mileage
may vary.
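Here is a small Java sketch of why this works out. The text above doesn't say exactly what the "new String value" looks like, so the substring(0, 4) call is my assumption about a plausible translation of the truncation; the class name is hypothetical too.

```java
// Sketch: sharing a String is safe because "modifying" it creates a new one.
final class StringCopyDemo {

    static String[] run() {
        String hello = "hello";
        String greeting = hello;              // stands in for the strcpy() call

        // Assumed translation of: greeting[4] = '\0';
        // substring() returns a new String; the object that 'hello'
        // refers to is left untouched.
        greeting = greeting.substring(0, 4);

        return new String[] { hello, greeting };
    }
}
```

After run(), the first element is still "hello" and the second is "hell", just as the two separate C buffers would have ended up.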
The point here is that Java programmers just
don't go around copying Strings all the time. Producing Java
code that does this would simply be producing "C code with
Java syntax", and would nullify some of the benefits of moving to Java
in the first place (such as not having to keep calling strcpy()
and not wasting memory and CPU time making lots of copies of Strings).
Example 3: Error Handling vs. Exception Handling
Another example of cases where not-perfectly-working
code could be generated has to do with error handling. Many
(maybe even most) C projects follow a convention of having
every function return a flag to indicate whether the function actually worked
or not. Java programmers do not follow this convention,
and instead use the Exception feature for handling errors.
So this C code:
struct person *p = malloc(sizeof(struct person));
if (!p) {
    printf("malloc failed\n");
}
...becomes this Java code...
try {
    Person p = new Person();
}
catch (OutOfMemoryError e) {
    System.out.println("malloc failed");
}
Many C library functions report failure through their return
value (and often set "errno" as well), and your C code checks for those errors. One of the transformation
rules knows all the standard C functions and the error values
they return (for example, fopen returns NULL on error, while open
returns -1). This rule will detect that you are checking for
these error conditions and place your error-recovery code in
a catch block. This is all well and good, no problems so far.
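As an illustration of the shape of the result, here is how a check on fopen()'s return value might end up in Java. This is a sketch under my own assumptions: the class name, the use of FileReader, and the messages are not taken from Jazillian's output.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

// Sketch of the Java shape for C code along the lines of:
//     FILE *f = fopen(path, "r");
//     if (f == NULL) { /* error-recovery code */ }
final class OpenFileDemo {

    static String open(String path) {
        try (FileReader in = new FileReader(path)) {
            return "opened " + path;
        } catch (FileNotFoundException e) {
            // The C error-recovery code for a NULL return lands here.
            return "cannot open " + path;
        } catch (IOException e) {
            // Closing the reader can fail too; C code rarely checks fclose().
            return "cannot close " + path;
        }
    }
}
```

The happy path reads straight through, and the recovery code sits off to the side in the catch block instead of being interleaved with the main logic.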
If you're not a Java programmer yet, brace
yourself: Java programmers almost never bother to catch OutOfMemoryError.
Horror of horrors! "We were taught to always
check the return value of every function call!" you're
yelling. Well, I'm sorry to break it to you, old timer, but
these lazy kids today just don't bother to clutter their code
with all those checks. You can like it or not, but that's
the reality.
So the Jazillian rule that replaces error-checking codes
with exception handling in fact will just discard code that
handles a malloc() failure. I'm sorry, but that's what real
Java programmers do, and that's one of the many little reasons
why Java code is a lot more readable and maintainable.