Monday, December 28, 2009

Friday, December 18, 2009

C input routine for single character and string

I am trying to relearn some C/C++ by answering beginner questions in a programming forum. As always, I learned something new along the way; this is a page I referred to when trying to clear the stdin buffer.

The first example captures a single character and checks whether it is a number between 1 and 4:

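/* needs <stdio.h> */
char option[2];  /* one character plus the '\0' that fgets appends */
int number = 0;
do {
    puts("Please enter your choice:");
    fflush(stdout);
    if (fgets(option, sizeof option, stdin) == NULL) {
        break;  /* EOF or read error: stop instead of looping forever */
    }
    if (option[0] != '\n') {
        scanf("%*[^\n]");  /* get rid of the non-newline characters */
        scanf("%*c");      /* get rid of the newline character */
    }
} while (!(sscanf(option, "%d", &number) == 1 &&
           number >= 1 && number <= 4));
/* after an EOF break, number may still be outside 1..4 */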


The 2nd example captures a string with trimming, overflow protection and emptiness checking:


/* needs <stdio.h>, <string.h>, <stdbool.h>; trimwhitespace() is a separate helper */
char fileStr[20];
char *pointer = fileStr;
bool tooLong = false;

do {
    printf("\nPlease input a file name: ");
    fflush(stdout);
    pointer = fileStr;  /* start from the buffer again on every attempt */
    tooLong = false;
    if (fgets(fileStr, sizeof fileStr, stdin) == NULL) {
        fileStr[0] = '\0';
        break;          /* EOF or read error: stop instead of looping forever */
    }
    if (*fileStr != '\n') {
        /* search for newline character */
        char *newline = strchr(fileStr, '\n');
        if (newline != NULL) {
            *newline = '\0';                    /* overwrite trailing newline */
            pointer = trimwhitespace(pointer);  /* trim the line */
        } else {
            /* clear the stdin since user input too much */
            tooLong = true;
            scanf("%*[^\n]");
            scanf("%*c");
        }
    }
} while (tooLong || *pointer == '\0' || *pointer == '\n');

printf("file name = \"%s\"\n", pointer);


Pay special attention to which variable you use at the end: it is "pointer" (the trimmed string), not "fileStr".

Tuesday, December 08, 2009

Birt and java.lang.NoClassDefFoundError: org/w3c/tidy/Tidy Tidy.jar

Lately I have been debugging a customer issue with our BIRT integration. The exception message is java.lang.NoClassDefFoundError: org/w3c/tidy/Tidy. We found a lot of posts on Google about this error.

Most of the Google results, and this one, are related to file/folder permissions and to the location of the jars in Tomcat/WebSphere. Our integration does not involve any webapp container, and we double-checked several times that the file/folder permissions were okay.

Then I remembered that lsof can show which jars the Java process has open. It showed all the jars marked as "deleted", which was a big hint to me: something was wrong with the Java process. This time we found that the same Java process had been launched twice. This is a good lsof tutorial, by the way.

Friday, December 04, 2009

Javascript window.event.keyCode in firefox

This is usually used for capturing an ENTER key press in an HTML text box:
function onkeypressed(e) {
    // IE exposes the event as window.event; Firefox passes it in as the argument
    var keyCode = (window.event) ? window.event.keyCode : e.which;
    if (keyCode == 13) {
        // ENTER was pressed: do something
        return true;
    } else {
        return false;
    }
}

Monday, November 02, 2009

Use regular expression to extract / parse a string into a collection of matched results recursively

While helping someone with an assignment, I almost forgot a routine I love for parsing a string into an ArrayList of matched results with a regular expression. Here it is:

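// needs java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
Pattern p = Pattern.compile(...);
Matcher m = p.matcher(inputString);
List<String> l = new ArrayList<String>();
// find() without an argument continues after the previous match and cannot loop forever on an empty match
while (m.find()) {
    l.add(m.group(index)); // index depends on your regular expression grouping
}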

Thursday, October 01, 2009

PHP 5.2.10 session ID (sessionId) in URL problem in Solaris (all ZEROes 0000 as the year of the set-cookies)

Wow, I actually spent 4 hours looking into what was wrong with this session-ID-in-URL problem on a Solaris box.

My journey started with turning on all the PHP logging. No luck: nothing showed up in error.log when the problem happened.

Then I played around with all the 0/1 settings in session.* (actually use_only_cookies and use_trans_sid). No luck either.

After that, I compared a working Windows setup with this Solaris setup in phpinfo()... nothing special there.

I was starting to believe there was something wrong with the cookie, so I tried using PHP's setcookie() with a timeout. Using Firefox, I found that the cookie from the Solaris Apache never arrived in the browser. So I started searching Google for "apache cookie" and, NO, I wasted another hour.

Hmm, it seemed the cookie was actually set in the header. I then used "curl -i" to dump the Solaris headers and finally noticed the difference: the year of the expiration date was ALL ZEROes. I confirmed it with a cookie WITHOUT a timeout: that cookie did appear in Firefox for both the Windows and the Solaris Apache/PHP setups.

I googled some more and finally got the correct query: "set-cookie solaris 0000 year". The first result is the answer. Wow, I almost gave up on the way... I created this post for more Google result matches.

Thursday, September 24, 2009

Two links for future reference - Spring & inner class and access target from proxy in aop

First link is here: you can create a bean from an inner class by giving it a constructor argument that refers back to the parent class bean; otherwise you will get an exception along the lines of "no default constructor".

Second link is here, it is basic I know, just something I will most likely forget in the future if I need this again.

Scala 2.8 scala.io.Source throws java.nio.charset.UnmappableCharacterException at an unmappable sequence of bytes by default

Using Scala 2.8 to grab a webpage encoded in Big5, Source.fromURL throws UnmappableCharacterException at a Chinese character (in bytes) that cannot be mapped to a Unicode character. The default behavior of the Scala Codec is to report this exception.

From reading the Codec source, you can see that Codec is actually built on java.nio.charset.CharsetDecoder. According to the javadoc, there is a method onUnmappableCharacter, and there are three different CodingErrorAction values to choose from: IGNORE, REPLACE and REPORT.

In scala.io.Codec source:

def onUnmappableCharacter(newAction: Action): this.type = { _onUnmappableCharacter = newAction ; this }

So that's easy enough,

import java.nio.charset.CodingErrorAction.REPLACE
import scala.io.Codec
implicit def codec = Codec("big5").onUnmappableCharacter(REPLACE)
scala.io.Source.fromURL(...) // a big5 encoded page with unmappable bytes

Then everything should go through quietly without any error, since each unmappable sequence is replaced by the decoder's replacement value.

Wednesday, September 16, 2009

Python in cygwin with puttycyg or mintty - interactive mode without prompt

I finally found the answer here, in the last post of that thread: start python with the -i option to force interactive mode.

$ python -i
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

Yay!

Tuesday, September 15, 2009

Python 3 Unicode - print() in a putty cygwin terminal with UTF8 enabled

When I tried to rewrite my hkgolden forum stat program in Python 3, the Unicode issue was the first thing I had to deal with. Launching the program from a putty cygwin terminal (UTF8 enabled) and printing the web page content to it, I immediately ran into two problems: 1. the Chinese was not Chinese anymore, and 2. "UnicodeEncodeError: 'gbk' codec can't encode character '\u2022' in position 188: illegal multibyte sequence".

The Chinese cannot be shown correctly because print() automatically picks up a default encoding from the terminal/OS, so a plain print() of the page content fails. As you could guess from problem #2, the default for my system is "gbk", since I have picked "Simplified Chinese" as my non-Unicode encoding in Windows (print(sys.stdout.encoding) returns 'cp936' for me).

After three hours of reading the Python reference docs and googling, I figured out how to bypass print() with sys.stdout.buffer.write(), which outputs bytes directly to stdout.

sys.stdout.buffer.write(line.decode("big5").encode())

'line' holds big5 encoded bytes
line.decode("big5") turns the bytes into a Unicode string in Python 3
line.decode("big5").encode() turns the Unicode string into utf8 encoded bytes (encode() defaults to utf8)
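The three steps can be tried without a terminal or a web page. This is only a minimal sketch: the sample text stands in for one line of the fetched page, and to_utf8_bytes is a name made up for illustration.

```python
def to_utf8_bytes(line: bytes, source_encoding: str = "big5") -> bytes:
    """Decode bytes in the source encoding, then re-encode them as UTF-8."""
    text = line.decode(source_encoding)  # bytes -> str (Unicode)
    return text.encode()                 # str -> bytes; encode() defaults to UTF-8


# Hypothetical sample standing in for one big5 encoded line of the page.
sample = "中文".encode("big5")
converted = to_utf8_bytes(sample)
```

The result is a bytes object, which is why it can go straight to sys.stdout.buffer.write() without print() getting a chance to apply the terminal's default encoding.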

More research on this sys.stdout.buffer.write led me to the better answers at "Setting the correct encoding when piping stdout in python" and here.


import sys
import codecs
sys.stdout = codecs.getwriter('utf8')(sys.stdout.buffer)
print(line.decode("big5")) # automatically using utf8 to output the unicode string


From http://www.python.org/doc/3.1/library/codecs.html#codecs.StreamWriter, stream must be a file-like object open for writing binary data, and that's our "sys.stdout.buffer".
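The "binary stream" requirement can be checked in isolation. In this small sketch, io.BytesIO is an assumption of mine standing in for sys.stdout.buffer, so that the bytes that would reach the terminal can be inspected.

```python
import codecs
import io

# io.BytesIO stands in for sys.stdout.buffer: both accept bytes only.
binary_stream = io.BytesIO()
writer = codecs.getwriter("utf8")(binary_stream)

# The writer accepts str and passes UTF-8 encoded bytes to the wrapped stream.
writer.write("\u2022 bullet\n")
written = binary_stream.getvalue()
```

The '\u2022' character that 'gbk' refused to encode comes out as the three UTF-8 bytes e2 80 a2.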

The best way to deal with Unicode is to treat every input as bytes and decode it right after receiving it (big5 in our example), and to send every output as bytes by encoding the internal string representation (same as in Perl). The best choice for the output encoding here is utf8.

Thursday, September 10, 2009

Scala scala.io.Source fromURL blocks / hangs forever without timeout value

Recently I have been working on a Scala project that data mines a forum. Since I am using multiple threads to fetch the web content, I reached a point where some of my threads block/hang forever at various lines, like Source.getLine, hasNext or even fromURL. Here is one example of a thread dump stack:


"pool-1-thread-194" prio=6 tid=0x0b57d000 nid=0x1710 runnable [0x0f4af000..0x0f4afa94]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <0x031a5488> (a java.io.BufferedInputStream)
at sun.net.www.MeteredStream.read(MeteredStream.java:116)
- locked <0x031efdc8> (a sun.net.www.http.KeepAliveStream)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2446)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
- locked <0x031efe48> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.read(BufferedReader.java:157)
- locked <0x031efe48> (a java.io.InputStreamReader)
at scala.io.BufferedSource$$anonfun$1$$anonfun$apply$1.apply(BufferedSource.scala:29)
at scala.io.BufferedSource$$anonfun$1$$anonfun$apply$1.apply(BufferedSource.scala:29)
at scala.io.Codec.wrap(Codec.scala:65)
at scala.io.BufferedSource$$anonfun$1.apply(BufferedSource.scala:29)
at scala.io.BufferedSource$$anonfun$1.apply(BufferedSource.scala:29)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:146)
at scala.collection.Iterator$$anon$1.next(Iterator.scala:712)
at scala.collection.Iterator$$anon$1.head(Iterator.scala:699)
at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:374)
at scala.collection.Iterator$$anon$17.hasNext(Iterator.scala:319)
at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:706)
at scala.io.Source$LineIterator.getc(Source.scala:182)
at scala.io.Source$LineIterator.next(Source.scala:195)
at scala.io.Source$LineIterator.next(Source.scala:165)
at scala.io.Source.getLine(Source.scala:163)


With a debugger attached, you should be able to see that the timeout parameter of the socketRead0 method is actually ZERO, which means no timeout at all. That's why fromURL can block forever.

Open up the scala.io.Source source (2.8): fromURL is actually a convenience method for fromInputStream(url.openStream())(codec).

Now that's easy: just forget about the fromURL method and use fromInputStream with a java.net.URLConnection instead.


import java.net.URL
import scala.io.Source

val timeout = 60000
val conn = (new URL(url)).openConnection()
conn.setConnectTimeout(timeout)
conn.setReadTimeout(timeout)
val inputStream = conn.getInputStream()

val src = Source.fromInputStream(inputStream,
Source.DefaultBufSize,
null,
() => inputStream.close())


EDIT: forgot to close the stream!

Saturday, August 29, 2009

Perl Unicode

Recently I was struggling with Unicode in Perl. There are a few things I found tricky:

From here, there is a clear explanation of the encode/decode functions. If my Perl program is a command line program that accepts cp950 arguments on a Chinese Windows, the arguments need to be "decoded" into Perl's internal form.

use Encode;

for (my $i = 0; $i < scalar(@ARGV); $i++) {
    $ARGV[$i] = Encode::decode("cp950", $ARGV[$i]);
}

When you send the arguments to a server that accepts UTF8, you need to escape the non-ASCII characters before sending:

$data =~ s/(\P{IsASCII})/sprintf('%2x;',ord($1))/eg;

After getting the server response, if it is in UTF8, you need to decode the response once more into Perl's internal form.

$response = Encode::decode_utf8($response);

So, the last step is to convert Perl's internal form to cp950 for STDOUT.

binmode STDOUT, ":encoding(cp950)";

This is all from memory, so there may be mistakes.

Tuesday, August 04, 2009

Processing EDI, XML, CSV and more with Smooks

I saw this headline on TSS and remembered that handling these formats used to be a tedious task at my previous company. Posting this for future reference.

http://www.theserverside.com/news/thread.tss?thread_id=55339

Monday, August 03, 2009

Windows Gadget experience

Recently I finished a Windows gadget that uses ajax to grab its data. The examples I followed were the official gadgets from Microsoft: Stocks and Feed Headlines.

As several sites and books say, a Windows gadget is composed of HTML, CSS and JavaScript. At first glance, though, the Stocks gadget was not that easy. Drilling down into the code to see how the data is parsed from the web led to an ActiveX object backed by a DLL. I googled this and found out that it really is the case: http://www.cnblogs.com/yayx/archive/2007/09/04/881879.html (a Chinese webpage, however). Another problem with the Stocks JavaScript is that it is all structured in a static class style; click here for how that differs from the JavaScript you have normally seen in the past. The number of code lines is horrible too: more than 4000 lines does not help. The reason is that the author basically put all the logic, INCLUDING all the HTML creation such as tables and TDs, in the JavaScript instead of in the HTML layer.

Anyway, the very first task was to move the data retrieval into something that can be easily tested. I replaced ActiveX with ajax with help from these pages:

http://developer.novell.com/wiki/index.php/Using_the_XMLHttpRequest_object
http://www.ibm.com/developerworks/web/library/wa-ajaxintro2/
http://www.jibbering.com/2002/4/httprequest.html

The server code is linked to the UI code with the listener pattern: you register the functions you want to run as listeners on the server code. Once the data is retrieved OR the status of the connection changes, the listener functions are triggered and your UI is updated.
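The gadget itself is JavaScript, but the listener wiring is language neutral, so here is a rough Python sketch of the idea. DataFetcher, on_data, on_status and deliver are all hypothetical names, and deliver() merely simulates a finished request instead of talking to a server.

```python
from typing import Callable, List


class DataFetcher:
    """Hypothetical stand-in for the server communication code."""

    def __init__(self) -> None:
        self._data_listeners: List[Callable[[str], None]] = []
        self._status_listeners: List[Callable[[int], None]] = []

    def on_data(self, listener: Callable[[str], None]) -> None:
        self._data_listeners.append(listener)

    def on_status(self, listener: Callable[[int], None]) -> None:
        self._status_listeners.append(listener)

    def deliver(self, status: int, body: str) -> None:
        """Simulate a finished request: report the status, then the data on success."""
        for listener in self._status_listeners:
            listener(status)
        if status == 200:
            for listener in self._data_listeners:
                listener(body)


# UI side: register callbacks; the fetcher knows nothing about the UI.
ui_log: List[str] = []
fetcher = DataFetcher()
fetcher.on_data(lambda body: ui_log.append("show " + body))
fetcher.on_status(lambda status: ui_log.append("status %d" % status))

fetcher.deliver(200, "quotes")  # successful response
fetcher.deliver(12029, "")      # 12029: unable to establish connection
```

Because the fetcher only calls whatever was registered, it can be tested with fake listeners and no UI at all, which was the point of pulling the data retrieval out in the first place.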

12029 is the status code that indicates "unable to establish connection".
setTimeout() and setInterval() are the functions for running a function after a delay or repeatedly at a fixed interval.
Reusing the XMLHTTP Object is described here.

Once the server communication code was ready, I tried to code the UI the same way as in Stocks but failed with the element positioning. In contrast, Feed Headlines is a much better example to work from. Its author defined the tables in the HTML and a number of methods that are triggered by HTML events like onmouseout, onwheelmove etc.

There is a function from the web that transforms milliseconds into a human readable date in JavaScript. Watch out for the "var day = Math.floor(hr/60)": shouldn't it be divided by 24?!
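To double check the arithmetic, here is a small Python sketch of the same breakdown with the divisor fixed. It is not the original function from the web; split_duration is a made-up name, and it assumes the input is a millisecond count.

```python
def split_duration(ms: int):
    """Break a millisecond count into (days, hours, minutes, seconds)."""
    sec = ms // 1000
    minute, sec = divmod(sec, 60)    # 60 seconds per minute
    hr, minute = divmod(minute, 60)  # 60 minutes per hour
    day, hr = divmod(hr, 24)         # 24 hours per day, not 60
    return day, hr, minute, sec


# 1 day + 1 hour + 1 minute + 1 second, expressed in milliseconds
parts = split_duration(90061000)
```

With hr/60 the day count would only tick over every 60 hours, so anything between 24 and 59 hours would be reported as zero days.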

Finally, two more tricky spots that cost me a few hours of debugging:
1. When I tried the gadget, it only worked against one server. It did not connect to another server: the XMLHttpRequest status code was zero and the XML response was empty. The only way to make this problem disappear was to kill the sidebar.exe process and start it over.

2. In setting.js of the Feed Headlines gadget, the loadSettings(); call in load() looked like a duplicate to me at first. After several hours of struggling with why the saved settings could not be passed to the settings UI, this line turned out to be the missing piece.

Overall, the experience was great, and I am really satisfied with the results.

Saturday, July 18, 2009

Simple 2.1

I just found http://simple.sourceforge.net/, an XML/object framework that is quite similar to the in-house custom library built by my CTO. Writing this down as a reference in case I need it in the future.

Wednesday, July 08, 2009

Unicode in Java

Today I found out that the JVM system property file.encoding needs to be set to UTF-8 on non-English Windows to work properly with a utf-8 configured MySQL (that is, with DEFAULT CHARSET=utf8 in CREATE TABLE).

-Dfile.encoding=UTF-8


In traditional chinese windows, the default code page is ms950, while it is windows-1252 for my local English windows setup.

Monday, June 29, 2009

PHP 500 Internal Server Error

Today I ran into a "500 Internal Server Error" message in the integration test. I wasted an hour searching Google for "Client-Warning: Redirect loop detected" and playing around with the require statement etc.

The easiest way would have been to just turn on error logging in php.ini with log_errors = On. The error message in the log told me that the require statement couldn't find the target PHP file. That's it...

PHP Fatal error:  require_once() [function.require]: Failed opening required 'utilities.php' (include_path='/opt/ecloud/i686_Linux/php/lib/php') in /opt/ecloud/i686_Linux/apache/htdocs/accelerator/evalPhp.php(18) : eval()'d code on line 2

Friday, June 26, 2009

Jetty Handler with NIO and Continuation

After an http or https request comes in through SelectChannelConnector or SslSelectChannelConnector, the handle(...) method of your custom handler, which extends AbstractHandler, will run.
public void handle(String target,
                   HttpServletRequest request,
                   HttpServletResponse response,
                   int dispatch)
        throws IOException,
               ServletException {

    // Obtain Jetty continuation
    Continuation continuation = getContinuation(request, null);

    // Create a custom callback and store it in each HttpServletRequest
    // This callback is to wrap the continuation object for the other thread
    // to call continuation.resume() for generating the response.
    // "callback" should be a final static variable in production server.
    Callback callback = (Callback) request.getAttribute("callback");
    if (callback == null) {
        callback = new Callback(continuation);
        request.setAttribute("callback", callback);
    }

    // Synchronize on callback to prevent continuation.resume() from
    // happening before continuation.suspend().
    synchronized (callback) {
        if (continuation.isNew()) {
            // Dispatch the request to another area with different thread
            // and of course, callback must be referenced later to call
            // continuation.setObject() and continuation.resume() down the road.
        }
        // zero here for the simplicity
        continuation.suspend(0);
    }

    // Up to this point, the continuation is resumed, and got the object ready
    // for response.
    PrintWriter out = null;
    try {
        out = response.getWriter();
        Object obj = continuation.getObject();
        // further processing the obj for the "out"
    } finally {
        if (out != null) {
            out.close();
        }

        // Reset the continuation
        continuation.reset();
        continuation.setObject(null);
    }
}

Wednesday, June 24, 2009

MySQL alter table with multiple indexes

Recently I needed to add/delete multiple indexes on the same table. I had not noticed that I can chain the add/drop index clauses in a single "ALTER TABLE", which copies the table only once instead of multiple times.

http://brian.moonspot.net/mysql-alter-multiple-things

In the past, I tried to copy the huge dataset over to a temp table created with the new indexes, but I ran into a problem with deleting a FK in the child table that I couldn't solve. http://bugs.mysql.com/bug.php?id=14347