Weakly Reachable

... until the collector arrives ...

This "blog" is really just a scratchpad of mine. There is not much of general interest here. Most of the content is scribbled down "live" as I discover things I want to remember. I rarely go back to correct mistakes in older entries. You have been warned :)

2015-01-29

OSGI Configuration Area Considered Harmful

Our Eclipse RCP application has a test suite that is itself packaged up as an RCP application. One of the unit tests started failing, complaining that the expected hash code for the contents of a certain file is not as expected. The file had indeed been changed, and the test had been updated accordingly. Yet the test failure reported that the hash was the old one instead of the new one. We double-, triple-, quadruple-checked and verified that the correct file was on disk. And the hash code in the unit test matched the new file.

To add suspense, the test would only fail on one machine, the build server of course. The test passed on every other machine we tried, whether we ran it in the IDE or in a full headless build.

After much head-scratching to no avail, we finally went low tech and modified the code running on the build server to print out the path of the file in question. We were surprised, but not surprised, to find that the file being used by the test was not anywhere in the source tree, nor in the temporary build area. Rather, it was in the OSGI Configuration area.

Apparently, OSGI copies certain resources from their host plug-ins into the configuration area. Our code had asked for a file path (not URL) to the resource, so that behaviour is fair enough. What is confusing is that OSGI never refreshed that file even when the source resource had changed. Some of the resources had been refreshed, but not that one. I suppose that is what "eclipse -clean" is all about?

In hindsight, this is not the first such weird behaviour that I have seen with respect to the configuration area. I can recall instances when new perspective layouts, or even simple UI part labels, would not reflect their current values until the configuration area was cleared.

In case anyone was wondering, it is failures like this that drove us to package the test suite as an RCP application in the first place. Running the tests in the OSGI environment can bring forth obscure (and irritating) behaviours such as this.

Moral: if an OSGI app is behaving in a manner that defies explanation, see if the behaviour persists after clearing the configuration area.

2014-10-17

SQL Character Collation

The following SQL Server query will generate a list of characters for the first 65,536 code points in collation order. For illustration purposes, a WHERE clause has been added to show only those code points that lie within the printable ASCII range:

WITH
  n AS (
    SELECT 0 AS n
    UNION ALL SELECT n + 1 FROM n  WHERE n < 65536
  )
, c AS (
    SELECT n AS codePoint, NCHAR(n) AS char FROM n
  )
, coll AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY char, codePoint) AS seq FROM c
  )
SELECT
  *
FROM coll
WHERE codePoint BETWEEN 32 AND 126
ORDER BY seq
OPTION (MAXRECURSION 0)

2014-08-14

Passing integer lists to SQL Server, Redux

Newer versions of SQL Server support array-valued query parameters, but not all database drivers support them (e.g. jTDS or, at time of writing, the Microsoft JDBC driver). It is a widely known pattern to pass lists of integers in a CSV string, and then to use a TSQL function to parse that string. This method is also known to be quite slow for large lists. Here is a TSQL function alternative that uses a binary representation of the integers, and runs about an order of magnitude faster than the CSV approach:

CREATE FUNCTION dbo.fnBinaryToIntTable(@idList VARBINARY(MAX))
RETURNS TABLE AS RETURN
WITH indices AS (SELECT 0 AS i UNION ALL SELECT i + 1 FROM indices)
SELECT TOP (LEN(@idList) / 4)
  i + 1 AS idx
, CAST(SUBSTRING(@idList, i*4 + 1, 4) AS int) AS n
FROM indices

GO

SELECT * FROM dbo.fnBinaryToIntTable(CAST(134 AS BINARY(4)) + CAST(463 AS BINARY(4)))
OPTION(MAXRECURSION 0)

Unfortunately, this function requires the calling query to specify OPTION(MAXRECURSION 0) for lists with more than 100 elements. This is a significant source of error as it not an obvious calling requirement. It is possible to bake the option into the query by using an intermediate table variable:

CREATE FUNCTION dbo.fnBinaryToIntTable(@idList VARBINARY(MAX))
RETURNS @result TABLE(idx INT, n INT)
AS BEGIN
  WITH indices AS (SELECT 0 AS i UNION ALL SELECT i + 1 FROM indices)
  INSERT INTO @result
  SELECT TOP (LEN(@idList) / 4)
    i + 1 AS idx
  , CAST(SUBSTRING(@idList, i*4 + 1, 4) AS int) AS n
  FROM indices
  OPTION(MAXRECURSION 0)
  RETURN
END

... but this performs only 2-3x better than the CSV solution. Trade-offs, trade-offs.

2014-08-07

Com4J leaks COM objects

com4j provides a helper class ComObjectCollector. It is used to track and dispose of all COM objects created within a sequence of operations. For example:

ComObjectCollector objects = new ComObjectCollector();
COM4J.addListener(objects);

_Application xl = excel.ClassFactory.createApplication();
Workbooks workbooks = xl.getWorkbooks();
_Workbook workbook = workbooks.add(null, 0);
Com4jObject sheetObj = workbook.getWorksheets(1);
_Worksheet sheet = sheetObj.queryInterface(_Worksheet.class);
// ...

objects.disposeAll();
// BUG: "sheet" should be disposed at this point, but is not
COM4J.removeListener(objects);

Unfortunately, version com4j-20120426-2 leaks a COM reference in the exhibited code. Specifically, the reference held by sheet is not disposed by objects.disposeAll.

The reason is subtle. ComObjectCollector internally uses a WeakHashMap to hold the object references. The references are represented by the com4j Wrapper class. Wrapper redefines the equals method to account for aliased references. It turns out that this method considers sheetObj and sheet to be equal even though they are actually distinct references that need to be disposed individually. sheet is therefore not actually added to the map because of that equality. Consequently, it is not disposed by objects.dispoaseAll().

Incidentally, Wrapper should not be used with WeakHashMap since the latter stipulates that it should only be used with objects that implement identity equality.

The fix to the leak is actually quite simple... re-implement the trivial class ComObjectCollector. Here is an inline version:

final Map _objects = new IdentityHashMap<>();
ComObjectListener listener = new ComObjectListener() {
    @Override
    public void onNewObject(Com4jObject obj) {
        _objects.put(obj, null);
    }
};
COM4J.addListener(listener);

_Application xl = excel.ClassFactory.createApplication();
Workbooks workbooks = xl.getWorkbooks();
_Workbook workbook = workbooks.add(null, 0);
Com4jObject sheetObj = workbook.getWorksheets(1);
_Worksheet sheet = sheetObj.queryInterface(_Worksheet.class);
// ...

for (Com4jObject obj : _objects.keySet()) {
    obj.dispose();
}
COM4J.removeListener(listener);

This version uses strong references instead of weak references, but that should not be a problem in the absence of large numbers of transient references. A more elaborate implementation would have to either change the equality semantics of Wrapper (requiring an impact analysis), or hold more direct references to the underlying COM objects in the WeakHashMap.

Blog Archive