tag:blogger.com,1999:blog-40325318709368313302024-02-18T19:44:42.577-07:00Weakly Reachable... until the collector arrives ...MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comBlogger589125tag:blogger.com,1999:blog-4032531870936831330.post-78768353453615889802017-05-24T11:43:00.001-06:002017-05-24T11:52:51.246-06:00Obsolete Cookie Support in Java<p>The current HTTP cookie specification is <a href="https://tools.ietf.org/html/rfc6265">RFC 6265</a>, published back in 2011. It obsoletes <a href="https://tools.ietf.org/html/rfc2965">RFC 2965</a> (from 2000) which in turn obsoletes <a href="https://tools.ietf.org/html/rfc2109">RFC 2109</a> (1997). RFC 6265 retired support for the pseudo-cookies whose names start with a dollar sign. Such cookies used to be interpreted as cookie attributes (e.g. <em>Path</em>, <em>Version</em> etc). Web servers will now often reject requests containing cookie names that start with "$".</p>
<p>
Unfortunately, the Java HTTP client does not yet support RFC 6265 (as of version 8). It still implements RFC 2965 (and supports RFC 2109). In particular, it generates $-prefixed pseudo-cookies and is thus incompatible with many contemporary servers. The incompatibility may manifest as <em>400 Bad Request</em> errors, sometimes only when using secure protocols, where HTTP message syntax checking tends to be more stringent.</p>
<p>
Use the <a href="https://hc.apache.org/">Apache client</a> instead.
</p>
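<p>The legacy formatting can be seen directly in <code>java.net.HttpCookie</code>, which still emits RFC 2965-style $-attributes for version-1 cookies. A small demonstration (the cookie name and value are made up):</p>

```java
import java.net.HttpCookie;

public class CookieVersionDemo {
    public static void main(String[] args) {
        // Version 1 selects the RFC 2965 format, which carries the
        // $-prefixed pseudo-attributes that RFC 6265 retired.
        HttpCookie legacy = new HttpCookie("session", "abc123");
        legacy.setVersion(1);
        legacy.setPath("/");
        System.out.println(legacy); // contains $Path -- rejected by strict RFC 6265 servers

        // Version 0 selects the old Netscape format: a plain name=value pair,
        // which modern servers accept.
        HttpCookie plain = new HttpCookie("session", "abc123");
        plain.setVersion(0);
        plain.setPath("/");
        System.out.println(plain); // prints a plain name=value pair
    }
}
```

<p>Forcing version 0 on every cookie is a crude mitigation at best; switching HTTP clients remains the cleaner fix.</p>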
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-82558517490241499802016-08-15T11:37:00.000-06:002016-08-15T11:37:04.954-06:00Windows RDP vs. Password Changes<p>
On Windows, there is an obscure security policy that prohibits changing one's password for the first time through Remote Desktop. Normally, one can type CTRL-ALT-DEL to bring up a menu that, among other things, allows a password change. When logged in remotely using RDP, the key chord is CTRL-ALT-END. However, Windows can detect this situation, and if the very first password change is attempted after using CTRL-ALT-END then it will be rejected as "access denied". There is a work-around, though:
</p>
<ol>
<li>Bring up the onscreen keyboard (OSK.EXE).</li>
<li>Type CTRL-ALT using the regular keyboard.</li>
<li>Press DEL on the onscreen keyboard.</li>
<li>The password change function will now operate normally (i.e. no "access denied").</li>
</ol>
<p>
This is helpful on virtual machines where remote access is the only possibility. The purpose of this strange security policy remains unclear.
</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-82252127988518761932016-08-15T11:29:00.000-06:002016-08-15T11:29:16.273-06:00Windows Event Logs: Command-line Access<p>The following Windows command will extract events from the <em>Security</em> log on the machine <em>somehost</em> provided those events have the ID <em>4624</em> (login) and are for the user <em>someuser</em>:</p>
<blockquote><code><pre>wevtutil ^
query-events ^
/r:somehost ^
Security ^
/q:"Event[System[EventID=4624] and EventData[Data[@Name='TargetUserName']='someuser']]"</pre></code></blockquote>
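<p>Because each matching event is emitted as a free-standing XML fragment, the combined output is not itself well-formed XML. One way to consume it from Java is to wrap the fragments in a synthetic root element before parsing; a sketch (the <code>Events</code> wrapper name is arbitrary):</p>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class EventWrapper {
    // Wrapping the concatenated fragments in a synthetic root element
    // yields a single well-formed document that any XML parser accepts.
    static Document parseFragments(String fragments) throws Exception {
        String wrapped = "<Events>" + fragments + "</Events>";
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(wrapped.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        String sample = "<Event><System><EventID>4624</EventID></System></Event>"
                      + "<Event><System><EventID>4624</EventID></System></Event>";
        NodeList events = parseFragments(sample).getElementsByTagName("Event");
        System.out.println(events.getLength()); // 2
    }
}
```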
The query syntax is described as XPath that operates upon the XML representation of the events. Beware that it is actually a weird subset of XPath that does not recognize <code>.</code>, <code>..</code>, <code>/</code>, or <code>//</code>. Also, the generated XML results are free-standing fragments for each event (i.e. not wrapped by a parent element and thus not well-formed XML).MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-45642817419906956722016-06-23T09:22:00.002-06:002016-06-23T09:22:31.463-06:00Dependency Injection Frameworks<p>System requirements that justify the use of a dependency injection framework:</p>
<ol>
<li>The system has at least 64 components whose implementations must be selectable at run-time by editing an external text file.</li>
<li>The development team is not permitted to know the application initialization sequence.</li>
</ol>
<p>:)</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-88483413424171588742015-08-17T12:54:00.001-06:002015-08-17T12:54:26.893-06:00LOOKUP vs. Aspose Cells for Java<p>Consider the following Excel formula:</p>
<code>=LOOKUP(2, {"a",1,2,3}, {"x","one","two","three"})</code>
<p>In Excel 2013, this returns "two". In Aspose Cells for Java 8.3.2, it returns "N/A".</p>
<p>The Excel documentation is wishy-washy on what the correct result should be:</p>
<blockquote><em>
Important: The values in array must be placed in ascending order: ..., -2, -1, 0, 1, 2, ..., A-Z, FALSE, TRUE; otherwise, LOOKUP might not return the correct value. Uppercase and lowercase text are equivalent.
</em></blockquote>
<p>
This tells us that it is inadvisable for our look-up table to have <code>"a"</code> before the numeric values <code>1, 2, 3</code>. The Cells result is probably superior to Excel's, but the discrepancy unfortunately means that the same spreadsheet can produce different results in Excel and in Cells.
</p>
<p>
This is obviously a corner case, and a spreadsheet that depends upon a well-defined result in such circumstances is asking for trouble. I reported this to Aspose not so much for a fix, but rather to potentially save another user from a long debugging session.
</p>MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-64568224879123884682015-01-29T15:13:00.000-07:002015-01-29T15:13:09.881-07:00OSGI Configuration Area Considered Harmful<p>Our Eclipse RCP application has a test suite that is itself packaged up as an RCP application. One of the unit tests started failing, complaining that the hash code for the contents of a certain file was not as expected. The file had indeed been changed, and the test had been updated accordingly. Yet the test failure reported that the hash was the old one instead of the new one. We double-, triple-, quadruple-checked and verified that the correct file was on disk. And the hash code in the unit test matched the new file.</p>
<p>To add suspense, the test would only fail on one machine, the build server of course. The test passed on every other machine we tried, whether we ran it in the IDE or in a full headless build.</p>
<p>After much head-scratching to no avail, we finally went low tech and modified the code running on the build server to print out the path of the file in question. We were surprised, but not surprised, to find that the file being used by the test was not anywhere in the source tree, nor in the temporary build area. Rather, it was in the OSGI Configuration area.</p>
<p>Apparently, OSGI copies certain resources from their host plug-ins into the configuration area. Our code had asked for a file path (not URL) to the resource, so that behaviour is fair enough. What is confusing is that OSGI never refreshed that file even when the source resource had changed. Some of the resources had been refreshed, but not that one. I suppose that is what "eclipse -clean" is all about?</p>
<p>In hindsight, this is not the first such weird behaviour that I have seen with respect to the configuration area. I can recall instances when new perspective layouts, or even simple UI part labels, would not reflect their current values until the configuration area was cleared.</p>
<p>In case anyone was wondering, it is failures like this that drove us to package the test suite as an RCP application in the first place. Running the tests in the OSGI environment can bring forth obscure (and irritating) behaviours such as this.</p>
<p>Moral: if an OSGI app is behaving in a manner that defies explanation, see if the behaviour persists after clearing the configuration area.</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-50189870750554838232014-10-17T12:45:00.000-06:002014-10-17T12:45:16.354-06:00SQL Character Collation<p>The following SQL Server query will generate a list of characters for the first 65,536 code points in collation order. For illustration purposes, a <code>WHERE</code> clause has been added to show only those code points that lie within the printable ASCII range:</p>
<blockquote><code><pre>WITH
n AS (
SELECT 0 AS n
UNION ALL SELECT n + 1 FROM n WHERE n < 65535
)
, c AS (
SELECT n AS codePoint, NCHAR(n) AS [char] FROM n
)
, coll AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY [char], codePoint) AS seq FROM c
)
SELECT
*
FROM coll
WHERE codePoint BETWEEN 32 AND 126
ORDER BY seq
OPTION (MAXRECURSION 0)</pre></code></blockquote>MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-57414256439744794872014-08-14T12:12:00.001-06:002014-08-14T12:18:13.133-06:00Passing integer lists to SQL Server, Redux<p>Newer versions of SQL Server support table-valued parameters, but not all database drivers support them (e.g. jTDS or, at time of writing, the Microsoft JDBC driver). It is a widely known pattern to pass lists of integers in a CSV string, and then to use a TSQL function to parse that string. This method is also known to be quite slow for large lists. Here is a TSQL function alternative that uses a binary representation of the integers, and runs about an order of magnitude faster than the CSV approach:</p>
<blockquote><code><pre>CREATE FUNCTION dbo.fnBinaryToIntTable(@idList VARBINARY(MAX))
RETURNS TABLE AS RETURN
WITH indices AS (SELECT 0 AS i UNION ALL SELECT i + 1 FROM indices)
SELECT TOP (DATALENGTH(@idList) / 4)
i + 1 AS idx
, CAST(SUBSTRING(@idList, i*4 + 1, 4) AS int) AS n
FROM indices
GO
SELECT * FROM dbo.fnBinaryToIntTable(CAST(134 AS BINARY(4)) + CAST(463 AS BINARY(4)))
OPTION(MAXRECURSION 0)</pre></code></blockquote>
<p>Unfortunately, this function requires the calling query to specify <code>OPTION(MAXRECURSION 0)</code> for lists with more than 100 elements. This is a significant source of error as it is not an obvious calling requirement. It is possible to bake the option into the query by using an intermediate table variable:</p>
<blockquote><code><pre>CREATE FUNCTION dbo.fnBinaryToIntTable(@idList VARBINARY(MAX))
RETURNS @result TABLE(idx INT, n INT)
AS BEGIN
WITH indices AS (SELECT 0 AS i UNION ALL SELECT i + 1 FROM indices)
INSERT INTO @result
SELECT TOP (DATALENGTH(@idList) / 4)
i + 1 AS idx
, CAST(SUBSTRING(@idList, i*4 + 1, 4) AS int) AS n
FROM indices
OPTION(MAXRECURSION 0)
RETURN
END</pre></code></blockquote>
<p>... but this performs only 2-3x better than the CSV solution. Trade-offs, trade-offs.</p>
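<p>On the client side, the binary list is straightforward to produce. SQL Server's <code>BINARY(4)</code> representation of an <code>int</code> is big-endian, which matches the default byte order of a Java <code>ByteBuffer</code>. A minimal sketch of an encoder (the class and method names are made up):</p>

```java
import java.nio.ByteBuffer;

public class IdListEncoder {
    // Packs ids as big-endian 4-byte integers, matching the layout that
    // the TSQL function decodes with SUBSTRING + CAST(... AS int).
    static byte[] encode(int... ids) {
        ByteBuffer buf = ByteBuffer.allocate(ids.length * 4); // big-endian by default
        for (int id : ids) {
            buf.putInt(id);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = encode(134, 463);
        System.out.println(bytes.length); // 8
        // Bind with PreparedStatement.setBytes(...) on a query that
        // joins against the binary-to-int table function.
    }
}
```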
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-83983793770400402532014-08-07T08:48:00.001-06:002014-08-07T08:52:20.146-06:00Com4J leaks COM objects<p><a href="http://com4j.kohsuke.org/">com4j</a> provides a helper class <code>ComObjectCollector</code>. It is used to track and dispose of all COM objects created within a sequence of operations. For example:</p>
<blockquote><code><pre>ComObjectCollector objects = new ComObjectCollector();
COM4J.addListener(objects);
_Application xl = excel.ClassFactory.createApplication();
Workbooks workbooks = xl.getWorkbooks();
_Workbook workbook = workbooks.add(null, 0);
Com4jObject sheetObj = workbook.getWorksheets(1);
_Worksheet sheet = sheetObj.queryInterface(_Worksheet.class);
// ...
objects.disposeAll();
// BUG: "sheet" should be disposed at this point, but is not
COM4J.removeListener(objects);</pre></code></blockquote>
<p>Unfortunately, version <em>com4j-20120426-2</em> leaks a COM reference in the exhibited code. Specifically, the reference held by <code>sheet</code> is not disposed by <em>objects.disposeAll</em>.</p>
<p>The reason is subtle. <code>ComObjectCollector</code> internally uses a <code>WeakHashMap</code> to hold the object references. The references are represented by the com4j <code>Wrapper</code> class. <code>Wrapper</code> redefines the <code>equals</code> method to account for aliased references. It turns out that this method considers <code>sheetObj</code> and <code>sheet</code> to be equal even though they are actually distinct references that need to be disposed individually. <code>sheet</code> is therefore not actually added to the map because of that equality. Consequently, it is not disposed by <code>objects.disposeAll()</code>.</p>
<p>Incidentally, <code>Wrapper</code> should not be used with <code>WeakHashMap</code> at all, since the latter is documented as intended primarily for keys whose <code>equals</code> methods test object identity.</p>
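<p>The effect is easy to reproduce with a hypothetical stand-in for <code>Wrapper</code>: distinct instances that compare equal collapse to a single entry in an equality-based map, while an <code>IdentityHashMap</code> keeps both:</p>

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class AliasedKeyDemo {
    // A made-up stand-in for com4j's Wrapper: distinct instances that
    // nevertheless compare equal (mimicking aliased COM references).
    static final class Ref {
        @Override public boolean equals(Object o) { return o instanceof Ref; }
        @Override public int hashCode() { return 42; }
    }

    public static void main(String[] args) {
        Ref a = new Ref();
        Ref b = new Ref(); // a second reference that needs its own disposal

        Map<Ref, Object> byEquality = new HashMap<>();
        byEquality.put(a, null);
        byEquality.put(b, null); // silently collapses onto the first entry

        Map<Ref, Object> byIdentity = new IdentityHashMap<>();
        byIdentity.put(a, null);
        byIdentity.put(b, null); // both entries retained

        System.out.println(byEquality.size()); // 1
        System.out.println(byIdentity.size()); // 2
    }
}
```

<p>This is exactly why the re-implemented collector below uses <code>IdentityHashMap</code>.</p>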
<p>The fix to the leak is actually quite simple... re-implement the trivial class <em>ComObjectCollector</em>. Here is an inline version:</p>
<blockquote><code><pre>final Map<Com4jObject, Object> _objects = new IdentityHashMap<>();
ComObjectListener listener = new ComObjectListener() {
@Override
public void onNewObject(Com4jObject obj) {
_objects.put(obj, null);
}
};
COM4J.addListener(listener);
_Application xl = excel.ClassFactory.createApplication();
Workbooks workbooks = xl.getWorkbooks();
_Workbook workbook = workbooks.add(null, 0);
Com4jObject sheetObj = workbook.getWorksheets(1);
_Worksheet sheet = sheetObj.queryInterface(_Worksheet.class);
// ...
for (Com4jObject obj : _objects.keySet()) {
obj.dispose();
}
COM4J.removeListener(listener);</pre></code></blockquote>
<p>This version uses strong references instead of weak references, but that should not be a problem in the absence of large numbers of transient references. A more elaborate implementation would have to either change the equality semantics of <code>Wrapper</code> (requiring an impact analysis), or hold more direct references to the underlying COM objects in the <code>WeakHashMap</code>.</p>MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-62351660552523681342014-08-05T15:56:00.000-06:002014-08-05T15:57:29.243-06:00Broken Whole Row References in Apache POI<p>There is a bug in the Apache POI <a href="http://grepcode.com/file/repo1.maven.org/maven2/org.apache.poi/poi/3.10-FINAL/org/apache/poi/ss/util/AreaReference.java?av=f#172">AreaReference.getWholeRow()</a>. It returns a reference like <code>$A1:$IV1</code>. This is true for old Excel workbooks, but for the new format that POI supports as <code>XSSFWorkbook</code>, the reference should be <code>$A1:$XFD1</code>.</p>
<p>If you attempt to work around this by defining an explicit name for <code>$A1:$XFD1</code>, Excel will quietly rewrite the reference to <code>$1:$1</code>, triggering the bug once again in POI. A nasty work-around is to define a name as <code>$A1:$XFC1</code> (note: XFC instead of XFD). Excel leaves this reference untouched and POI handles it correctly. Too bad it is an awfully inconvenient work-around.</p>
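<p>For reference, the old BIFF format ends at column IV (0-based index 255) while the OOXML format ends at XFD (index 16383). A small stand-alone helper (not POI code) showing the index-to-name mapping:</p>

```java
public class ColumnName {
    // Converts a 0-based column index to an Excel column name (A, B, ..., XFD).
    static String name(int index) {
        StringBuilder sb = new StringBuilder();
        for (int i = index; i >= 0; i = i / 26 - 1) {
            sb.insert(0, (char) ('A' + i % 26));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(name(255));   // IV  (last column, .xls)
        System.out.println(name(16383)); // XFD (last column, .xlsx)
    }
}
```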
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-87203669119346263782014-04-24T16:18:00.000-06:002014-04-24T16:18:02.032-06:00Hashing + Doubles = Sneaky Nondeterminism in a JUnit Test<p>One of our calculation unit tests would fail erratically. Most of the time, it would pass. But on some runs (one in five?) one calculated double value would mysteriously flip from 0.0 to -0.0. Floating-point numerical errors are well-known, but the nondeterminism was a surprise as we could not think of any random or time-based elements in our calculation.</p>
<p>The cause turned out to be missing hashCode()/equals() on an object that was placed into a map. In such cases, the identity hash is used. The identity hash is typically based upon a memory location or a handle identifier and, as such, can easily change from run to run. The entries in the map were being iterated over, but the changing hashes would cause the iteration order to change. Elaborately calculated values from each iteration were being summed, and the changing summation order would sometimes produce a -0.0 from an underflow.</p>
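<p>Both ingredients are easy to demonstrate in isolation: <code>0.0</code> and <code>-0.0</code> compare equal as primitives but not as boxed values (which is why an assertion on the result can fail), and floating-point addition is not associative, so iteration order changes sums:</p>

```java
public class MinusZeroDemo {
    public static void main(String[] args) {
        // Primitive comparison treats the two zeros as equal, but boxed
        // equality compares bit patterns, so a flipped zero fails equals().
        System.out.println(0.0 == -0.0);                                  // true
        System.out.println(Double.valueOf(0.0).equals(Double.valueOf(-0.0))); // false

        // Addition is not associative: summing the same values in a
        // different order produces a different result.
        double a = 1e308, b = -1e308, c = 1.0;
        System.out.println((a + b) + c); // 1.0
        System.out.println(a + (b + c)); // 0.0
    }
}
```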
<p>The nondeterminism did not produce any material errors in the calculated results -- strictly speaking there was not a bug in the code. However, the intermittent failures of JUnit tests were unhelpful.</p>MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-44861423516562414792014-04-13T13:12:00.003-06:002014-04-13T13:12:50.894-06:00Heartbleed Commit<p>Here is a link to the <a href="http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=ssl/t1_lib.c;h=c5c805cce286d12d81c5fdccfe9173d61a68ee82;hb=4817504d069b4c5082161b02a22116ad75f822b1#l2407">commit</a> that introduced the Heartbleed OpenSSL Vulnerability.</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-87347748360946140132014-04-10T17:06:00.001-06:002014-04-10T17:06:13.494-06:00Eclipse OSGI - The Windows Virus<p>When Eclipse OSGI starts up, it checks to see whether certain locations are writable. Based on what it finds, it chooses locations for the configuration area, the user base directory, etc. What is "interesting" is the way that it determines whether a location is writable. Here is an excerpt from <a href="http://grepcode.com/file/repository.grepcode.com/java/eclipse.org/3.7.2/org.eclipse/osgi/3.7.2/org/eclipse/osgi/internal/baseadaptor/AdaptorUtil.java#208">AdaptorUtil.java</a>:</p>
<blockquote><code><pre>File fileTest = null;
try {
// we use the .dll suffix to properly test on Vista virtual directories
// on Vista you are not allowed to write executable files on virtual directories like "Program Files"
fileTest = File.createTempFile("writtableArea", ".dll", installDir); //$NON-NLS-1$ //$NON-NLS-2$
} catch (IOException e) {
//If an exception occured while trying to create the file, it means that it is not writtable
return false;
} finally {
if (fileTest != null)
fileTest.delete();
}
return true;</pre></code></blockquote>
<p>The code attempts to create a file named <em>writtableArea.dll</em> in the target directory. This sets off alarm bells on many virus scanners. This came to my attention because exactly one of our users reported that our RCP application would simply disappear after presenting the splash screen. They were using Kaspersky with particularly draconian settings.</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-28208123562310553022014-01-08T15:23:00.003-07:002014-01-08T15:23:49.922-07:00Eclipse RCP vs. Changing a Perspective Name<p>Using Eclipse 3.7.2 or 4.3.1, I ran into a problem where I wanted to change the name of the perspective in an RCP application. I changed the title in the <em>bundle.properties</em> file and relaunched my application from within the Eclipse IDE. The title had not changed. My immediate thought was that the title was cached in the application workspace data. So I relaunched the application after changing the launch configuration to clear that data first. The old label remained. I cleaned and rebuilt the application projects. I exited and re-entered Eclipse. I shook my fist in frustration. None of these would make the label change.</p>
<p>A scan of all files within the IDE <em>.metadata</em> directory yielded a couple of binary files that contained the old string. After some head-scratching, it was determined that these files were in the application's configuration area. This can be cleared in the launch configuration by switching to the <em>Configuration</em> tab and ticking the <em>Clear the configuration area before launching</em> checkbox. After that, the new label took effect.</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-88810342255498167482013-10-08T09:50:00.001-06:002013-10-08T10:38:19.637-06:00Inspecting the SQL Server Plan Cache<p>While investigating an insidious case of SQL Server query plan cache poisoning, I looked for a way to see the cached plans. Here is what I used:</p>
<blockquote><code><pre>WITH data AS (
SELECT
c.usecounts AS uses
, s.max_worker_time AS max_time
, CAST(
p.query_plan.query('
declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan";
for $d in distinct-values(//ColumnReference/@Database)
order by $d
return data($d)
') AS NVARCHAR(MAX)) AS databases
, CAST(
p.query_plan.query('
declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan";
for $t in distinct-values(//ColumnReference/@Table)
order by $t
return data($t)
') AS NVARCHAR(MAX)) AS tables
, c.objtype
, t.text AS sql
, p.query_plan
, s.last_execution_time AS executed
, s.execution_count AS runs
, s.creation_time AS created
, c.plan_handle
, s.sql_handle
, s.query_hash
, s.query_plan_hash
FROM sys.dm_exec_cached_plans AS c
LEFT JOIN sys.dm_exec_query_stats AS s
ON s.plan_handle = c.plan_handle
CROSS APPLY sys.dm_exec_sql_text(c.plan_handle) AS t
CROSS APPLY sys.dm_exec_query_plan(c.plan_handle) AS p
)
SELECT TOP 100 *
FROM data
ORDER BY max_time DESC</pre></code></blockquote>
<p>Note the final two hash columns. They help identify cases where different plans have been cached for a single SQL statement that has been executed with different parameter values.</p>
<p>Cached plans can be cleared using <code>DBCC FREEPROCCACHE</code> or <code>DBCC FREESYSTEMCACHE</code>. When benchmarking SQL, it can be useful to discard SQL Server's internal buffers. This can be done without restarting the server thus:</p>
<blockquote><code><pre>CHECKPOINT
GO
DBCC DROPCLEANBUFFERS</pre></code></blockquote>
<p>On a related note, the query hint <a href="http://blogs.msdn.com/b/sqlprogrammability/archive/2008/11/26/optimize-for-unknown-a-little-known-sql-server-2008-feature.aspx"><code>OPTIMIZE FOR UNKNOWN</code></a> can be used to essentially eliminate parameter-sniffing -- reducing non-deterministic optimizer behaviour.</p>MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-87764838877518450122013-08-09T13:27:00.000-06:002013-08-09T17:36:20.067-06:00Language Constructs<p><em>C</em> programmers, whose language maps directly to machine instructions, scoff at the bloat, tediousness and inefficiencies of ultra-high-level languages like <em>C++</em>.</p>
<p><em>Java</em> programmers, whose language constructs map directly to the output of <code>paste -d '\n' <(man cobol) <(man simula)</code>, rejoice that they do not have to deal with the bit-twiddling and low-level trivia of machine languages like <em>C++</em>.</p>
<p><em>LISP</em> programmers, whose language <strike>maps directly to</strike> is the metaphysical substrate of the cosmos, are confused by these opinions since they see no distinction between high-level and low-level code.</p>
<p><em>Perl</em> programmers dismiss the whole debate as academic since their language maps directly to the structure of a human mind (named <em>Larry</em>).</p>
<p>:)</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-14185604194858372692013-06-04T11:49:00.000-06:002013-06-04T11:50:52.610-06:00Using Internal Java Implementation Classes<p>The build of our Java 7 application failed today. Someone had referenced an internal Java implementation class within our code, namely <code>com.sun.xml.internal.txw2.output.IndentingXMLStreamWriter</code>. We use Eclipse, and this reference compiled and ran fine within the IDE. However, the PDE build would fail saying <em>cannot find symbol</em>. The difference can be explained when one realizes that the IDE compiles using the Eclipse compiler, but the PDE build uses <em>javac</em>.
</p>
<p>However, one might still be stumped as <code>IndentingXMLStreamWriter</code> can be found within <em>rt.jar</em>. It turns out that <em>javac</em> does not actually compile directly against <em>rt.jar</em>. Instead, it references the file <em>lib/ct.sym</em>. This file is essentially a zip archive containing all of the "blessed" classes. <code>IndentingXMLStreamWriter</code> is not in that file.</p>
<p>It is best to avoid using internal implementation classes, but there is a work-around if one insists. The undocumented compiler argument <code>-XDignore.symbol.file</code> will use <em>rt.jar</em> directly.</p>
<p>The details of this mechanism can be found by inspecting the OpenJDK source files <a href="http://hg.openjdk.java.net/jdk7/jdk7/langtools/log/tip/src/share/classes/com/sun/tools/javac/main/RecognizedOptions.java">com.sun.tools.javac.main.RecognizedOptions</a> and <a href="http://hg.openjdk.java.net/jdk7/jdk7/langtools/log/tip/src/share/classes/com/sun/tools/javac/file/JavacFileManager.java">com.sun.tools.javac.file.JavacFileManager</a>.</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-9097968240941710172013-05-10T12:10:00.001-06:002013-05-10T16:16:43.061-06:00Port Redirection Under Windows 7<p>Today I wanted to perform some benchmarking of an SQL application. All of the work up to this
point was on a single machine that hosted both the database and the application. As a result, SQL network traffic was flowing through the Loopback interface. I wanted to get a better idea of the effects of real network latency by putting the bytes out on the wire. I was too lazy to install the software or the database on another machine, so I wanted to set up a port reflector on a second Windows 7 machine.
</p>
<p>First, I tried using netcat. Well, strictly speaking it was the nmap version, <a href="http://nmap.org/ncat/">ncat</a>:</p>
<blockquote><code><pre>ncat -l -p 9999 --keep-open -c "ncat win7host 1433"</pre></code></blockquote>
<p>This listens on the local port 9999 and feeds every incoming request to a new ncat process that sends the incoming packets to the target host on the SQL Server port 1433. This worked, and introduced network latency. A <em>lot</em> of latency. The processes being created for each connection introduced so much delay that round-trip times resembled Internet latency. That was a little too much latency for my purposes.</p>
<p>As an alternative, I used the Windows netsh command:</p>
<blockquote><code><pre>netsh interface portproxy add v4tov4 listenport=9999 listenaddress=192.168.0.158 connectport=1433 connectaddress=win7host</pre></code></blockquote>
<p>This performed without any significant overhead introduced by the proxy itself. The port redirection was removed using:</p>
<blockquote><code><pre>netsh interface portproxy delete v4tov4 listenport=9999 listenaddress=192.168.0.158</pre></code></blockquote>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-68734391554977933462013-01-30T17:32:00.000-07:002013-01-31T11:18:39.597-07:00Java SAXParserFactory vs. XMLFilter<p>Using Java 7, I had problems trying to inject an XMLFilter into a SAX pipeline. My code
looked something like this:</p>
<blockquote><code><pre>SAXParserFactory saxFactory = SAXParserFactory.newInstance();
saxFactory.setNamespaceAware(true);
saxFactory.setValidating(false);
saxFactory.setSchema(schema);
XMLReader reader = saxFactory.newSAXParser().getXMLReader();
XMLFilter filter = new NamespaceRewriter();
filter.setParent(reader);
filter.setContentHandler(handler);
filter.parse(source);</pre></code></blockquote>
<p>The filter was a trivial extension of <em>XMLFilterImpl</em> that rewrote some of the document namespaces.
However, this code would fail. As can be seen, the <em>XMLReader</em> was configured to perform XML schema validation. The failure was that the reader would bypass the installed SAX filter when performing this validation. Using a debugger, I was able to verify at run-time that the reader's <em>ContentHandler</em> was correctly set to be my filter. But the reader was apparently performing the schema validation <em>prior</em> to invoking its <em>ContentHandler</em>.</p>
<p>I was able to work around this problem by inserting an explicit validation stage in the pipeline instead
of relying upon the validation baked into the <em>XMLReader</em> built by <em>SAXParserFactory</em>:</p>
<blockquote><code><pre>XMLReader reader = XMLReaderFactory.createXMLReader();
<strong>ValidatorHandler validator = schema.newValidatorHandler();</strong>
XMLFilter filter = new NamespaceRewriter();
filter.setParent(reader);
<strong>filter.setContentHandler(validator);</strong>
<strong>validator.setContentHandler(handler);</strong>
filter.parse(source);</pre></code></blockquote>
<p>The changes are <code><strong>emphasized</strong></code>.</p>
<p>I would say that this is unfortunate behaviour in the Java 7 implementation, but I lay the blame squarely
on the SAX API. The SAX API makes it devilishly difficult to build reader/filter chains correctly. In
particular, it is almost impossible to link independently assembled subchains into a single larger chain.
The API supports easy delegation of <em>parse</em> calls, but chaining content handlers is messy. A
filter is required to destructively replace the content handler of the next reader/filter
in the chain with an augmented content handler. If that next component has built-in behaviour that is
not expressed in the content handler, then that component cannot be wrapped successfully (e.g. the
problem at hand). Even if the XMLReader above had expressed the schema validation as an augmentation to
the content handler, how would a filter recover it? Presumably <em>getContentHandler()</em> is supposed
to return the unaugmented handler -- so how is one supposed to get the augmented one?</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-66410577473364872442013-01-10T15:39:00.002-07:002013-01-10T15:46:54.939-07:00SQL Server vs. EXISTS<p>Using Microsoft SQL Server, I had a query that retrieved rows from a view
of some complexity, i.e. something like this:</p>
<blockquote><code><pre>SELECT * FROM vwComplex WHERE ...</pre></code></blockquote>
<p>This query ran basically instantaneously, even though the view involved numerous UNION ALLs and joins on coalesced keys. My sample query criteria selected ~100 out of the view's ~80,000 rows.</p>
<p>In the general case, I needed a query that returned a single bit indicating whether the specific criteria will select at least one row. I tried this:</p>
<blockquote><code><pre>SELECT
CASE WHEN EXISTS (
SELECT * FROM vwComplex WHERE ...
) THEN 1 ELSE 0 END</pre></code></blockquote>
<p>I presumed that this statement would run faster than the original unadorned query since it can stop executing as soon as it finds the first row. My presumption was incorrect -- the query ran for a long, <i>long</i> time.</p>
<p>In this case, at least, it turns out that it is faster to run the full query than to perform an existence check:</p>
<blockquote><code><pre>SELECT SIGN(COUNT(*)) FROM vwComplex WHERE ...</pre></code></blockquote>
<p>No surprises here, of course. SQL query plan generation remains a mystical exercise.</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-64802023458673384312012-12-18T13:55:00.000-07:002012-12-18T13:55:07.462-07:00SQL Server Schema Differences<p>The following SQL statement will generate a crude list of column differences between two SQL Server databases:</p>
<blockquote><code><pre>DECLARE @oldDatabase NVARCHAR(MAX)
DECLARE @newDatabase NVARCHAR(MAX)
SET @oldDatabase = 'unittestnew_3_14_x'
SET @newDatabase = 'unittestnew_3_15_x'
DECLARE @sql NVARCHAR(MAX)
SET @sql = '
WITH
oldColumns AS (
SELECT
tab.name AS tableName
, ROW_NUMBER() OVER (PARTITION BY tab.name ORDER BY col.column_id) AS columnSeq
, col.name AS columnName
, typ.name AS type
, col.max_length AS length
, col.scale AS scale
FROM ' + @oldDatabase + '.sys.all_columns AS col
INNER JOIN ' + @oldDatabase + '.sys.all_objects AS tab
ON tab.object_id = col.object_id
AND tab.type = ''U''
INNER JOIN ' + @oldDatabase + '.sys.schemas AS sch
ON sch.schema_id = tab.schema_id
AND sch.name = ''dbo''
INNER JOIN ' + @oldDatabase + '.sys.systypes AS typ
ON typ.xusertype = col.system_type_id
)
, newColumns AS (
SELECT
tab.name AS tableName
, ROW_NUMBER() OVER (PARTITION BY tab.name ORDER BY col.column_id) AS columnSeq
, col.name AS columnName
, typ.name AS type
, col.max_length AS length
, col.scale AS scale
FROM ' + @newDatabase + '.sys.all_columns AS col
INNER JOIN ' + @newDatabase + '.sys.all_objects AS tab
ON tab.object_id = col.object_id
AND tab.type = ''U''
INNER JOIN ' + @newDatabase + '.sys.schemas AS sch
ON sch.schema_id = tab.schema_id
AND sch.name = ''dbo''
INNER JOIN ' + @newDatabase + '.sys.systypes AS typ
ON typ.xusertype = col.system_type_id
)
, matched AS (
SELECT
COALESCE(old.tableName, new.tableName) AS tableName
, old.columnSeq AS oldColumnSeq
, old.columnName AS oldColumnName
, old.type AS oldType
, old.length AS oldLength
, old.scale AS oldScale
, new.columnSeq AS newColumnSeq
, new.columnName AS newColumnName
, new.type AS newType
, new.length AS newLength
, new.scale AS newScale
FROM oldColumns AS old
FULL OUTER JOIN newColumns AS new
ON new.tableName = old.tableName
AND new.columnName = old.columnName
)
SELECT
tableName
, oldColumnName
, oldType
, oldLength
, oldScale
, CASE
WHEN oldColumnName IS NULL THEN ''added''
WHEN newColumnName IS NULL THEN ''deleted''
ELSE ''changed''
END AS diff
, newColumnName
, newType
, newLength
, newScale
FROM matched
WHERE oldColumnName IS NULL
OR newColumnName IS NULL
OR oldColumnName <> newColumnName
OR oldType <> newType
OR oldLength <> newLength
OR oldScale <> newScale
ORDER BY tableName
, COALESCE(oldColumnSeq, newColumnSeq)
'
EXEC sp_executesql @sql</pre></code></blockquote>MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-48338080421518920792012-12-13T10:52:00.001-07:002013-02-09T22:04:12.883-07:00Java vs. UTF-8 BOM<p>From time to time, one runs into Java software that has trouble handling a Unicode byte-order mark (BOM) in a UTF-8-encoded character stream. I've seen this, for example, in various XML processing pipelines. Most recently, I came across it in a character stream generated by a .NET component that was being consumed by a Java component. The problem is always the same -- the BOM finds its way into the data stream instead of being silently consumed by the Java I/O infrastructure.</p>
<p>This is a known problem, listed as <a href="http://bugs.sun.com/view_bug.do?bug_id=4508058">bug #4508058</a> on the Java bug parade. Oracle/Sun acknowledged the bug, and it was even briefly fixed in the "Mustang" (Java6) release. However, a follow-on bug report (<a href="http://bugs.sun.com/view_bug.do?bug_id=6378911">#6378911</a>) complained that the fix broke backwards compatibility with previous releases. So, the fix was ultimately withdrawn and the original bug was marked as "won't fix".</p>
<p>Bottom line: when writing Java components that consume UTF-8-encoded character streams, be prepared to consume the BOM yourself. Also, be aware that the various Microsoft I/O frameworks aggressively write BOMs into UTF-8 streams, even if the stream would otherwise be empty.</p>
<p>It seems to me that there is an opportunity here for Java to add an alternate method to construct UTF-8 readers that handle BOMs properly.</p>
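<p>Until then, a small wrapper that consumes the BOM on behalf of the caller is straightforward. The sketch below is illustrative only (the class and method names are invented, not part of any standard API):</p>

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackReader;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// A minimal sketch of a reader factory that consumes a leading UTF-8 BOM.
public class BomSkippingReader {

    // Wrap the stream in a UTF-8 reader and silently discard a leading
    // U+FEFF, which Java's UTF-8 decoder passes through as an ordinary
    // character instead of stripping.
    public static Reader utf8Reader(InputStream in) throws IOException {
        PushbackReader reader = new PushbackReader(
                new InputStreamReader(in, StandardCharsets.UTF_8), 1);
        int first = reader.read();
        if (first != 0xFEFF && first != -1) {
            reader.unread(first); // no BOM: push the character back
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // UTF-8 BOM bytes (EF BB BF) followed by "hello"
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF,
                          'h', 'e', 'l', 'l', 'o'};
        BufferedReader r = new BufferedReader(
                utf8Reader(new ByteArrayInputStream(withBom)));
        System.out.println(r.readLine()); // prints "hello" -- BOM consumed
    }
}
```

<p>The same wrapper is harmless on streams that carry no BOM, since any other first character is simply pushed back.</p>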
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-50533370348442749252012-12-12T14:49:00.000-07:002012-12-12T16:26:14.970-07:00curl vs. Windows vs. esi.manage measure export<p>A self-contained Windows command file can use <em>curl</em> to invoke the <em>esi.manage</em> measure export integration service through judicious use of newline and quote escaping:</p>
<blockquote><code><pre>@ECHO OFF
SETLOCAL
SET server=127.0.0.1:8080
SET request=^
-h ^
-g PROJECT.NAME ^
-c none -m PRODUCTION -b GROSS -v FORECAST -U Oil:boe -D NONE --curtailment ^
-c scheduled -m PRODUCTION -b GROSS -v FORECAST -U Oil:boe -D SCHEDULED --curtailment ^
-c unscheduled -m PRODUCTION -b GROSS -v FORECAST -U Oil:boe -D UNSCHEDULED --curtailment ^
-c all -m PRODUCTION -b GROSS -v FORECAST -U Oil:boe -D ALL --curtailment ^
2010-01-01 ^
18 M ^
\"origin subset\"
curl ^
--ntlm ^
--user : ^
-X POST ^
-H "content-type:text/plain" ^
--data-binary "%request%" ^
http://%server%/3esi/manage/integration/export
</pre></code></blockquote>
<p>The use of the <code>--data-binary</code> option is important since <code>-d</code> or <code>--data-ascii</code> will reformat the data (e.g. strip newlines).</p>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-63727042194120862172012-10-05T13:25:00.001-06:002012-10-05T13:26:42.370-06:00Serializing Run-Time Types with gson<p><a href="http://code.google.com/p/google-gson/">gson</a> is a JSON serialization library for Java. It is optimized for round-trip serialization and deserialization between Java and JSON representations of objects. To support such round trips, the serializer operates upon the static types of an object tree. As a result, the dynamic (run-time) types of objects are generally ignored.</p>
<p>This can be inconvenient for simple, output-only serialization scenarios. Consider what happens if we try to serialize, say, a <code>List<MyClass></code>. When gson serializes this list, the output will include only the fields declared by the element type (<code>MyClass</code>); fields introduced by each element's run-time type will <em>not</em> be serialized. If <code>MyClass</code> is an interface, then there are no declared fields to serialize at all, so such elements will always be serialized as an empty JSON object (<code>{}</code>).</p>
<p>One way to work around this is to implement a <code>TypeAdapterFactory</code> that handles all interface types:</p>
<blockquote><code><pre>final class MyAdapterFactory implements TypeAdapterFactory {
@Override
public <T> TypeAdapter<T> create(final Gson gson, TypeToken<T> type) {
Class<? super T> rawType = type.getRawType();
if (rawType.isInterface()) {
return new TypeAdapter<T>() {
@Override
public void write(JsonWriter out, T value) {
Type valueType = null == value ? Object.class : value.getClass();
gson.toJson(value, valueType, out);
}
@Override
public T read(JsonReader in) {
// we only care about serialization in this scenario
throw new IllegalStateException();
}
};
}
return null;
}
}</pre></code></blockquote>
<p>This adapter factory will delegate serialization of an object of any interface class to a custom <code>TypeAdapter</code> that inspects the run-time type of the object.</p>
<p>The factory is installed using <code>GsonBuilder</code>:</p>
<blockquote><code><pre>GsonBuilder gsonBuilder = new GsonBuilder();
gsonBuilder.registerTypeAdapterFactory(new MyAdapterFactory());
// ...
Gson gson = gsonBuilder.create();</pre></code></blockquote>
MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.comtag:blogger.com,1999:blog-4032531870936831330.post-57155797887731819712012-09-17T17:28:00.000-06:002012-09-17T17:32:28.273-06:00How Not to Use For in Mathematica<p>Someone was asking about this the other day...</p>
<p>StackOverflow once had the following question under the Mathematica tag:</p>
<blockquote><em>How could I calculate the trace of a n*n matrix, with a for loop? I know that the trace is the sum of the terms on the diagonal, but I haven't got a clue how to do it...</em></blockquote>
<p>The question has been deleted now, but before that happened I couldn't resist the devilish temptation to give an impertinent answer. For those who can view deleted questions on StackOverflow, the page <a href="http://stackoverflow.com/questions/8534793/wolfram-mathematica-7-matrix">still exists</a>. For those who can't, my response is quoted below...</p>
<blockquote>
<p>Let's start with a 5x5 matrix, <code>m</code>:</p>
<pre><code>m = Partition[Range[25], 5];
m // MatrixForm
</code></pre>
<p>which looks like this:</p>
<pre><code> 1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
</code></pre>
<p>In Mathematica, there are many ways to use <code>For</code> to compute the trace. Here is a simple way:</p>
<pre><code>For[tr=0; i=0, i<10000, ++i, tr+=RandomInteger[10]; tr]; Tr[m]
</code></pre>
<p>More efficient ways include:</p>
<pre><code>Unevaluated[For[tr=0; i=1, i <= Dimensions[m][[1]], ++i, tr += m[[i,i]]]; tr] /.
_For :> (tr = Tr[m])
</code></pre>
<p>or</p>
<pre><code>For[,True,,Return[Tr[m], For]]
</code></pre>
<p>or</p>
<pre><code>For[,False,,] /. Null -> Tr[m]
</code></pre>
<p>;)</p>
</blockquote>MMhttp://www.blogger.com/profile/05886743369379676368noreply@blogger.com