Sepulkarium: programming

Showing posts with label programming. Show all posts

Monday, November 25, 2013

A golden rule of performance optimization

Of course, one shouldn't optimize what doesn't need optimization, so one should use profiler to see what's slow. At least YourKit allows for two kinds of performance profiling: sampling and tracing. The first one reports which code takes most time, and the second one also includes invocation counts for each method. And the rule is:

One should look at both sampling profiles and the ones with invocation counts.

Sampling profiles show you where a program is slow. Tracing profiles show why it's slow. It's quite often that I see some innocent things in the snapshots, like getting a value from a hash table. This doesn't mean that I should optimize hash table, it means that it's called zillions of times. So I should step back several stack frames and look carefully at the algorithm, trying to decrease its time complexity. Invocation count information helps to find the exact abstraction level I should pay attention to.

Sadly one can't use only tracing profiles, even if they also report the time each method takes. It's like quantum mechanics: the tracing overhead affects the time reported too significantly, and one can't trust it anymore.

In some cases it's obvious where the algorithm is inefficient, and sampling snapshots are enough. But increasingly often I find it to be not the case.

Yes, it's all well-known. And it's been about 15 years ago that I first read (and agreed) that algorithm optimization gives much better results than low-level one. But to learn something and to realize it, understand it deeply, are completely different things.

Saturday, September 21, 2013

Data flow analysis

I do many things in IntelliJ IDEA. One of the most interesting of them is data flow inspection, also known to the world under a somewhat strange name Constant Conditions & Exceptions. It analyzes your Java code, checks how you assign variables and what you write in if conditions about them, and issues warnings like Condition is always true or This method call can produce NullPointerException. The analysis it performs is quite advanced. Sometimes, it requires some thinking to understand that a warning is actually correct. Users even complain it's too intelligent (e.g. 1, 2, 3)!

I didn't maintain this inspection code from the very beginning, only in the last several years. During this time, I did several improvements, not counting numerous bug fixes. I like this inspection. I like even its bugs, because it's fun fixing them. So if you see a bug, please report it, I'll be grateful for this new fun!

Closures

I've supported anonymous classes, whose analysis should be aware of what happens outside the class. For example, o cannot be null inside Runnable and no warnings should be reported:

o = getSomeObject();
if (o != null) {
new Runnable() {
public void run() {
o.doSomething();
}
};
}

Repeated invocations

The inspection used to warn on the following code:

@Nullable File getFile() {...}
...
if (getFile() != null) return getFile().getName();

It's perfectly correct, but quite inconvenient. Correct because getFile is marked as @Nullable, it's a method and its result may change between invocations. Even if it was a mutable field named file, it could be reassigned from another thread. But it's an inconvenient warning because most of the code out there isn't subject to these conditions. And I don't feel like making users change their perfectly valid code by extracting a variable out of getFile() just to make a tool happy.

So now IDEA produces no possible NPE warning on such calls. What's complicated about this fix is that it also doesn't presuppose getFile() being not-null. So if you write code targeted to mutable environment, and check the null-ness of the file again in case it's changed, this second if (getFile() != null) check is not reported as an always true condition. What happens inside is that after any method call (getFile in this case) IDEA marks this expression as having an unknown value, and doesn't track any possible constant conditions involving it.

Tracking other objects

I've also supported data flow on final fields of other instances, not only this. And on non-final fields. And on getter method calls on other objects. And on expression chains comprised of all of the above. The main catch is to wisely forget the facts you know about all those expressions when it's time (e.g. on qualifier reassignment).

Contracts

More recently, I've added declarative method contract support to the inspection. It was quite easy. What was hard, was not to generate any excessive warnings because of a contract. If you call a method transform(o) where transform has a contract of null->null, then IDEA analyzes the method call as o == null ? null : transform(o). And whenever IDEA sees that you compare a variable to null, it unsurprisingly starts suspecting that it might actually be null! So here's the discrepancy: with such a contract, IDEA actually sees the comparison with null, but the user doesn't. It's just a normal method call, but its argument mysteriously is considered nullable after the call. As I've said, it was hard to do and took me more than a week of thinking, but I did it finally!

It remains to add checking that the contract corresponds to what the method is actually doing, and to infer contracts automatically. But it's quite non-trivial, and I'm still trying to understand how it could be implemented.

Value is always constant

if (var==null) { ... someMethod(var) ... }

Quite probably, passing to a method a variable which is known to be null, is a bug and the user intended something else. Even if it's not, writing null instead of var may make the code clearer. So the inspection reports such uses of variables with values known to be null, true or false and suggests to replace them with constants.

Too complex?

Finally, the thing I did yesterday and I'm really proud of. There's a problem: IDEA marks some methods as being too complex to analyze. This warning irritates people and not everyone understands it means you don't get any NPE warnings inside that method. The warning appears in two cases:

when it just takes too long to analyze the method
when the number of possible variable states is too large

Having a profiler, the first cause is relatively easy to fix (and is actually quite rare). The second one is harder.

So, there's too many data flow states. Each state is basically a set of assertions about variables and their values. IDEA tracks these states for each point in your code, and each point may correspond to many states. For example, a nullable variable is represented by two states: in one of them it's null, in the other it's not-null. If you call a method on this variable, both states are considered, and if in any of them the variable is null (which is true) the NPE warning is issued.

For simplicity let's assume that all assertions are either in form variable==value or variable!=value.

So, how to get many-many different states? It's very easy. You may have, say, 10 variables. Just compare each of them with some constant value, and here you are:

//1 state here
if (var1 == 0) { print("var0"); }
//2 states here: var1==0 and var1!=0
if (var2 == 0) { ... }
//4 states here:
//var1==0 && var2==0, var1==0 && var2!=0
//var1!=0 && var2==0, var1!=0 && var2!=0
...
if (var10 == 0) {...}
//1024 states here, too complex :(

But it need not be so sad! It's not too complex for people to understand, why should it be so for the IDE? The 2 states after the first if statement actually are complementary, they only differ in their assertions about var1. In fact, they can safely be replaced with just one state with no assertions about var1.

And that's the basic idea. At each point where different control flow branches join, all the states are gathered and checked if they can be merged. This has reduced the number of states dramatically, at least for the samples of "too complex" methods in our code base that I've looked at. Hence, many methods are not too complex anymore, users are not annoyed and get useful highlighting instead. Profit!

One issue is that determining which states can be merged is quite computationally expensive at the moment, and it doesn't merge all the states it could. But I'm working on it.

Thursday, March 28, 2013

Java JIT and code cache issues

If you have a lot of Java code, you may run out of code cache used by JIT to compile frequently used code. When this happens, you can get an OutOfMemoryError or (if you're lucky) just a message in the console that JIT is stopped and the rest of the newly loaded code will be interpreted, so, slow.

You can increase the code cache size by specifying -XX:ReservedCodeCacheSize=xxx, but you'll run into the same problem after you add more code.

You can enable -XX:+UseCodeCacheFlushing to deal with that. It's supposed to make some cleanup when the code cache gets full, to free space for new method compilation.

Today I learned that it's not as good as it sounds, by reading some 25M of JIT logs on IntelliJ IDEA process (JDK 1.7 64-bit). The code cache got full, it was flushed, which really freed up some decent amount of code cache space. That's OK. There was a problem though. After this event all the JIT compilation was stopped. So the compiled code that happened to be flushed away was interpreted now, no matter how hot-spottish it really is.

This is quite noticeable. For example, code completion and goto-class navigation suddenly become very slow after a day of work. They match lots of names against your typed prefix, and unfortunately some of this matching code gets deoptimized and becomes visible in the CPU snapshots.

What to do? Wait for Oracle to fix this. And for now we increased code cache size for 64-bits JVMs.

Maybe we should also take a closer look at the logs and try to reduce the hotspot count. Just look at the methods that are compiled at runtime and try calling them less frequently to prevent them occupying code cache at all. This could also improve the overall performance.

Monday, September 3, 2012

Groovy highlighting, continued

I was wrong. I thought that ignoring method argument types was enough to avoid complex data flow with cyclic dependencies. I absolutely forgot about method call qualifier. It can be complex and contain many local variable references which may easily lead to cyclic dependencies. Too many method calls have qualifiers, and ignoring all of them would render the argument-based type inference feature.

I tried to decrease the performance slowdown by caching the things that are invoked during resolve but don't depend on other local variables and thus don't lead to cyclic dependencies and are actually cached. For example, the list of all non-local declarations visible from a particular reference in code. That gave some speedup, but not enough for me.

So I remembered that once there was a completely different type inference algorithm. It was control-flow-based, it was smart, it inferred the local variable types for the whole method at once. It walked the control flow graph and incrementally refined the variable types. I had abandoned it because it had three issues:

Each iteration used the previously computed types which were stored in a global thread-local variable. Not beautiful.
The results based on these partially-computed types were cached elsewhere and used by other highlighting threads which led to spurious errors in the editor.
Sometimes whole-method inference is not needed and slow, e.g. when you're searching for usages and want to resolve just one reference as quickly as possible.

Now I had to give a second thought about abandoning that algorithm. OK, it had all those problems, but at least it was fast. So why not try to solve those problems another way?

Non-beautiful global thread-local state. That's easy, you just have to pass this state around to all the methods that deal with type inference and resolve. Easy, yes, but quite tiresome. After about two hours of non-intellectual parameter propagation, when my changelist exceeded 50 files, I decided I can live with this global thread-local state and reverted everything.
The problem of caching. It's actually easy as well. There's not many places whose caching leads to multi-threading problems, so it's easy to introduce yet another indirection level there and cache those results in the same global thread-local state during type inference.
Finally, the "find usages will be slow again" problem. One can analyze the variable dependency graph on each other and infer the types for a strongly-connected subcomponent of that graph. So far I haven't done this though.

But I have done everything else! I've restored the algorithm and adapted it to the new feature set, which was tricky. Now Groovy type inference in IntelliJ IDEA doesn't introduce any cyclic dependencies caused by local variables and is not exponential. Seems fast enough on my cases, but of course, there's always something to improve in that area. In fact, for some code it's a bit slower now, I have to investigate that.

Thursday, August 30, 2012

Recursion, caching and Groovy highlighting

It's hard to write anything (e.g. my parser) when it takes 20 seconds for IntelliJ IDEA to highlight a not very large Groovy file. And after you rename anything, you wait the same amount of time while it's optimizing imports. Today I finally decided to investigate the issue. Most of the highlighting time is spent in reference resolution: an IDE must know what every identifier refers to.

Recursion

As it happens, nowadays most performance problems in reference resolution are caused by cyclic dependencies. That means, resolving one reference requires resolving another one, and that one in turn requires the first one. As we use data flow to determine the variable types in Groovy, this is easy to encounter when you have a loop:

def a = new A()
def b = new B()
while (...) {
a = b.nextA()
...
b = a.nextB()
}

In a = b.nextA() one should first determine the type of b, then find a nextA method on that type and finally use its return type for a. But we should take into account that this might be not a first iteration of the loop, therefore b might be assigned in the very bottom of the loop body, so its type should be taken from there. And there it depends on the type of a defined in the same assignment that we are looking at.

That's just impossible to figure out. So if we discover such a cyclic dependency, we give up on inferring this particular expression type and use other assignments available. In this case a would just have the type A even if b.nextB() actually returns some AImpl.

Technically, it's just a stack overflow prevention. We maintain the list of expressions that we are currently calculating the type of. If we're asked for a type of such an expression, we just return null. The caller is ready for that.

Caching

Things get more complicated because we also cache the type of each expression. The problem is, we shouldn't cache null returned by stack overflow prevention functionality. Moreover, we shouldn't cache anything that depends on that null. Because in the end the type of a will not be null, it'll be A. And anything that depends on it will have a normal type based on A. If we cached an incomplete type, another highlighting thread would come and use it, and highlight a wrong error.

That's why endless recursion prevention and caching should know about each other. In IDEA, we have RecursionManager class which does precisely that.

As a result, if we have lots of cyclic dependencies, we don't cache lots of things. In fact, RecursionManager tries to memoize some calculation results that don't directly lie on a cyclic dependency. And this memoization is what makes this class really complex. It speeds up things quite a bit, but still, the best solution is not to create cyclic dependencies unless one really needs to.

Back to Groovy

So I went commenting out various parts of my Groovy code and putting a breakpoint on return null in RecursionManager. And here's what I found.

There were a couple of plain bugs which also led to cyclic dependencies: incorrect control flow for for-in loops, and too wide search scope when resolving a super class name.

Some Groovy AST transformations (e.g. @TupleConstructor) add constructors with parameters based on the class properties. Property list retrieval requires traversing all class methods and fields. Constructors are also methods in IDEA's terminology, so the transformation handler depends on itself. Fixed in the handler by carefully querying only the physical fields and methods, without running any extenders.

Finally, a new feature in IDEA 12. Each time a variable is passed to a method, IDEA checks the corresponding formal parameter type and narrows the inferred variable type accordingly if needed. Example:

def s = ...
if ("abc".contains(s)) ...
// now IDEA knows that s is a String

Unfortunately to resolve a method one should know the argument types. Moreover, after resolving one should also map arguments to parameters which is quite non-trivial in Groovy given all those default parameter values, named arguments. One should choose a correct overload. If a method is parameterized, then one argument type may be inferred from another. And we are right now calculating and refining the type of one of the arguments. Another cyclic dependency, quite a nasty one.

The best solution I have invented so far is to restrict this feature. Only allow cases when we can unambiguously resolve the method and map arguments to parameters without knowing argument types. This rules out methods with overloads, with default arguments, with generic parameters. Most of the methods I've ever encountered don't have it all anyway.

That's how I spent about 7 hours today. With all those changes (mostly with the last one) it now takes 3 seconds to highlight a file (instead of 20). Still not ideal, but bearable. And I can finally relax, learn some new things about Model Thinking and sleep.

UPD. Continued.

Sunday, July 22, 2012

Immutable data structures in OOP (Groovy)

In my parser, everything except some minor auxiliary things is immutable, and I cannot express how valuable it is. It helps debugging, testing, backtracking in the algorithm. But there are certain issues with immutability in Groovy (which I use for its syntactic sugar) and other object-oriented languages.

Update

One very often has to copy an object with a subset of its field values replaced, leaving the others as they are. Functional languages normally have special syntax for that; mainstream object-oriented languages normally don't, including Groovy. So I had to resort to an updating function like copyWith variant for Scala. Manually and without any reflection so far. When adding a field, I also have to update the default constructor, the all-fields constructor and the cloning function. Luckily that's not very often, otherwise I'll probably create an AST-transforming annotation that would do this automatically for me.

Collections

Groovy has no built-in persistent collections: immutable but cheap to create an updated copy of. I've found that Java ecosystem is not rich in these kinds of libraries at all, and that's a pity. Scala and Clojure have them, but tuned to those languages and I couldn't be bothered to use them so far. There are also pcollections and Functional Java libraries which don't have everything I need (e.g. a persistent version of LinkedHashMap). So I have either to create something myself, or use the standard Java collections very carefully, or just give up and rewrite everything in another language. So far I'm combining the first and the second variants.

State flow

I've found is that writing complex logic in instance methods may be quite error-prone. I have ParsingState class with apply method which is invoked when processing each word. It takes some possibly contradicting language constructions that a word may be part of, and decides which to activate, i.e. which parsing alternative to choose. This stage involves trying all the alternatives and evaluating their semantic plausibility. Something like this:

Whenever you change anything (obtain a changed copy of ParsingState), you must reassign it to state variable. Not a big deal, but a bit tiresome. Worse is this: everything except the first line has to be prefixed with state, because you need to operate on the latest possible version of the parsing state. And it's so damn easy to forget this qualifier! The IDE will autocomplete it, the compiler will eat it and most probably the problem won't be noticed for some time.

After several such bugs I'm very inclined to wrap all complex logic in static methods where you can't have unqualified references. There's normally only one possible state qualifier then, and if you have some special case, you add another variable to keep an earlier version of state, and to mix them is not so easy. At the same time, I like calling instance methods. It's very convenient: there's a namespace clearly defined by the qualifier type, it contains only what you need. As a result, I create an instance method which immediately calls some private static one where all the logic lies.

Conclusion

So, here's my wishlist for an ideal language:

Support for creating immutable structures easily. I find Scala way to be the most concise. Groovy offers @TupleConstructor annotation which is nice but not a part of the language.
Support for updating immutable structures so that adding a new field requires changing just one place.
Persistent collections with a predictable iteration order. An ability to create them using concise list/map literals would be a plus.
A way of expressing complex state transition logic concisely, clearly and not so error-prone. Like State monad, only simpler, something for human beings. Maybe just a syntactic sugar for var=f(var). Computing other values alongside state change as well: var,sideResult=f(var).

Monday, June 18, 2012

7 stages of exploratory programming

One has an idea on natural language understanding, starts implementing it. Cool, it works!
One encounters some sentences not easy to deal with. It's not clear how to do it the right way, beautifully, so one does it somehow — by dirty hacking. Hopefully, later it'll become more clear when new data arrives.
As more and more sentences sentences are encountered, some hacks are corrected but others are added. Hacks proliferate and get tightly intertwined.
It becomes harder to parse new sentences. Hacks start getting in the way. Fortunately, one starts seeing how to avoid them.
Correcting the design appears to be non-trivial because hacks depend on each other. So to fix one you should first correct another one, which requires third one, and so on. There's a whole dependency graph of hacks!
The test suite doesn't seem a friend anymore. It keeps failing, revealing more and more ways hacks depend on each other. No new sentences get parsed anymore.
After half a year of pure refactoring, the morale is low and still no light at the end of the tunnel. Yet another cyclic hack dependency is found. Tired, one decides to apply all the desired design changes at once and fix the tests one by one, almost as if writing the parser afresh. Goto 1.

Wednesday, February 9, 2011

Declarative programming

If you have a problem and think "I'll describe everything as data in a nice declarative way (maybe even XML)", now you have two problems.

Seriously, unless you can prove your software will never be extended, don't program declaratively. (I'm speaking mostly to myself here but if this helps anyone else, you're welcome). The reason is obvious: declarative description is very pretty and concise as long as it's simple. But usually it becomes extremely complex to change things a bit if that bit isn't foreseen from the beginning. And the real world is always like that. I've written quite a few frameworks and declarative parsers and there hasn't been any one where I didn't regret that when adding new functionality.

Simple things should be simple, complex things should be possible (c). With declarative code they usually end up being impossible. Even if they're possible, the efforts you spend are huge, and it looks like a hack. Just look at Ant.

Alternatives? Write code that does something, not the one that creates some structures which are then stored, sorted, converted into other structures, stored and sorted again and then finally some completely different code does something based on these structures. Just do it as straightforwardly as possible. Execute the code instead of creating data objects. Code invocation may in fact look quite DSLish (even in Java), so the not-so-big-if-any lengthiness will definitely pay off in the long run when you have to change the foundations. And it's muuuch easier to debug this way.

The moral: when writing data instead of code, think again. And again. And again.

Sepulkarium