Adi Drumea

Wednesday, September 02, 2009

[.NET] String.IndexOf

From MSDN: This method performs a word (case-sensitive and culture-sensitive) search using the current culture.

I knew it was case-sensitive, but I had to track down a bug to discover that it was also culture-sensitive. This led to some annoying behavior: searching for the string "Țară" ("country" in Romanian) in the string "Prima oară" ("first time" in Romanian) returned index 7. Because the default culture was en-US, IndexOf treated "Ț" as an empty string, so it matched just "ară". Weird default behavior if you ask me.

Anyway, they have decided to change it for .NET 4.0: "Please note that on Silverlight, and starting in .NET Framework 4.0, this method performs an ordinal comparison instead of a culture-sensitive comparison using the current culture (CultureInfo.CurrentCulture). This will result in this method having a different behavior on these platforms. Instead, it is advised that the String.IndexOf(String, StringComparison) overload be used on all platforms to minimize the impact to your existing applications."
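To make the difference concrete, here is a rough JavaScript sketch (not .NET code). JavaScript's indexOf compares code units, which is what an ordinal search does. The indexOfIgnoring helper and its ignorable set are hypothetical names made up for this illustration; they only imitate the effect of a comparison that gives "Ț" no weight, and say nothing about how .NET implements it.

```javascript
// The two strings from the example above, written with escapes so the
// code points are unambiguous.
const haystack = "Prima oar\u0103"; // "Prima oară"
const needle = "\u021Aar\u0103";    // "Țară"

// Ordinal search: compares UTF-16 code units one by one.
const ordinalIndex = haystack.indexOf(needle);

// Hypothetical helper: drops the characters that the comparison treats as
// ignorable from the search value, then searches for what is left.
function indexOfIgnoring(text, value, ignorable) {
  const stripped = Array.from(value)
    .filter((ch) => !ignorable.has(ch))
    .join("");
  return text.indexOf(stripped);
}

const simulatedIndex = indexOfIgnoring(haystack, needle, new Set(["\u021A"]));

console.log(ordinalIndex);   // -1
console.log(simulatedIndex); // 7
```

In .NET itself, the ordinal behavior is what you get from the IndexOf(String, StringComparison) overload with StringComparison.Ordinal.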

Monday, May 05, 2008

[Apache] Error 300 Multiple Choices

I've spent the last two hours trying to figure out the cause of this message. First I thought it had something to do with mod_rewrite, then I suspected mod_negotiation (didn't know what it does). To make things worse, searching for the error message only turned up error pages crawled by Google, nothing helpful. Finally, I used some magic keywords (url closest match) and found the guilty module: mod_speling. Once enabled, it matches the URL to the closest valid name it can find.

I can't help noticing the humour of the open source programmers, spelling 'spelling' as 'speling'. I'm not enjoying it very much; they'd do better to ship a safer default Apache config than to try so hard to look smart. Typical "rtfm!/i'm so smart" counterproductive behavior.

Tuesday, April 15, 2008

[C++] Sending e-mail without SMTP

Today I needed to write some code in a .NET application to allow the user to send e-mails with certain files attached to arbitrary e-mail addresses. The application must not depend on Outlook or any other e-mail client being installed on the machine.

I thought about the problem a bit and realized it can be done by querying DNS for the mail servers of the recipient's domain and sending the mail directly to one of them (like any MTA does). Fortunately, someone beat me to it and wrote a very simple solution based on ATL.

Here's the link on codeproject:

http://www.codeproject.com/KB/tips/CSMTPConnection2.aspx

I downloaded the source code, basically a derived class overriding CSMTPConnection::Connect to enumerate the SMTP servers of the given domain, but it did not build. I have Visual Studio 2008, and ATL Server is no longer included in it. So I downloaded the ATL Server sources from Microsoft's codeplex:

http://www.codeplex.com/AtlServer

Next, I hit a bug in GetRecipientsString() from atlmime.h, which I fixed. I then searched the issues on codeplex and found that it had already been reported:

http://www.codeplex.com/AtlServer/WorkItem/View.aspx?WorkItemId=4683

All that was left was to assemble the code into a Managed C++ library and wrap the functionality in a friendly class, and my task was done.
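The lookup step described above (find the mail servers for the recipient's domain, then try them in order of preference) can be sketched in a few lines of JavaScript. This is not the CodeProject class, just an illustration of the idea; domainOf, sortByPreference and mailServersFor are hypothetical names, and the resolver is passed in as a parameter so that the sketch does not depend on any particular DNS library.

```javascript
// Extract the domain part of an e-mail address: "user@example.com" -> "example.com".
function domainOf(address) {
  const at = address.lastIndexOf("@");
  if (at <= 0 || at === address.length - 1) {
    throw new Error("Not a valid e-mail address: " + address);
  }
  return address.slice(at + 1).toLowerCase();
}

// MX records carry a preference value; lower values are tried first.
// Returns a sorted copy and leaves the input array untouched.
function sortByPreference(records) {
  return [...records].sort((a, b) => a.priority - b.priority);
}

// resolveMx is assumed to be a function that takes a domain and returns a
// promise of [{ priority, exchange }, ...] records.
async function mailServersFor(address, resolveMx) {
  const records = await resolveMx(domainOf(address));
  return sortByPreference(records).map((record) => record.exchange);
}
```

In Node.js, require("node:dns").promises.resolveMx has the shape assumed here. The message itself is then delivered by speaking SMTP to the first server in the list that accepts a connection.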

Wednesday, March 12, 2008

[Windows] How to obtain meaningful application dumps from your clients

Step 1. Have them download and install the userdump package from Microsoft:

http://www.microsoft.com/downloads/details.aspx?FamilyID=E089CA41-6A87-40C8-BF69-28AC08570B7E&displaylang=en&displaylang=en

Step 2. In the userdump configuration, have them add your application.exe to the list of monitored processes.

Step 3. Have them run your application. When the error strikes, a dump will be saved to the Windows directory.

Step 4. Have them browse the Windows directory, sort by date modified, and send you the most recent .dmp file.

Step 5. Analyze the dump file with WinDBG from Microsoft:

http://www.microsoft.com/whdc/devtools/debugging/installx86.mspx

Monday, March 10, 2008

[HTML] Internet Explorer ignores TD width attribute when TD with colspan and large content is present

Let's take an HTML table with a fixed width and two fixed-width columns:

<table width="500" cellpadding="0" cellspacing="0"><tr><td width="300">a</td><td width="200">b</td></tr></table>

This will render in IE and Firefox as it is supposed to: one col 300 px and the other col 200 px.
If you add another row with one cell with colspan = 2, and the content is wide, like:

<tr><td colspan="2"><div style="width:490px;">aaaa</div></td></tr>

This works just fine in Firefox, but in IE it may break the widths of the other two columns, even though the content fits within the table width. Very weird.

The fix is to use percentage width on the first column and to not specify a width for the second column. In this case, you would change the first row to:

<tr><td width="60%">a</td><td>b</td></tr>

This will behave the same in IE and Firefox.

[MySQL, PHP] Using UTF-8 correctly -- updated

Here is a simple recipe for people who need to work with UTF-8 only in PHP and MySQL:

In my.ini set the params below and restart the service:

[mysql]
default-character-set=utf8
[mysqld]
default-character-set=utf8

In php.ini set the params below and restart the web server.

default_charset = "utf-8"

When using mysqli to make a connection from PHP, execute the following after connecting:

mysqli_set_charset($db, 'utf8');

Use mb_xxx functions for string functions in PHP. See this page for reference:

http://www.phpwact.org/php/i18n/charsets

Here is a summary of the function calls you need to replace in PHP (incomplete):

mail() --> mb_send_mail()
strlen() --> mb_strlen()
strpos() --> mb_strpos()
stripos() --> mb_stripos()
strrpos() --> mb_strrpos()
substr() --> mb_substr()
strtolower() --> mb_strtolower()
strtoupper() --> mb_strtoupper()
ereg*() --> preg_*() (In general preg_* is recommended)
preg_*() --> preg_*() with /u (make sure you add the /u modifier so patterns and subjects are treated as UTF-8)
sprintf --> ??? (uncertain, probably sprintf works out of the box)
$s[$i] --> mb_substr($s, $i, 1) (indexing returns the byte, not the character, at that index)
strstr() --> mb_strstr()
stristr() --> mb_stristr()
split() --> mb_split()
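The reason behind this list is that a UTF-8 string has more bytes than characters as soon as it contains anything outside ASCII. The following JavaScript sketch is only an illustration of that mismatch (it is not PHP): the byte count is what PHP's strlen() reports, and the character count is what mb_strlen() reports when the encoding is UTF-8.

```javascript
const sample = "\u021Bar\u0103"; // "țară"

// UTF-8 encoding of the string, as a Uint8Array of bytes.
const bytes = new TextEncoder().encode(sample);

// Array.from iterates by code point, so this counts characters.
const chars = Array.from(sample);

console.log(bytes.length);          // 6
console.log(chars.length);          // 4
console.log(bytes[0].toString(16)); // "c8", only the first half of "ț"
```

Indexing into the bytes gives a fragment of a character, which is the same thing that happens with byte-based indexing on a UTF-8 string in PHP.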

For preg_* functions, make sure character classes like [a-z] become \p{Ll} and [A-Z] become \p{Lu}.
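The effect of that replacement is easy to see in any regex engine that supports Unicode properties. A small JavaScript sketch, shown as an illustration only (JavaScript needs its own u flag for \p{...}, much as PCRE needs /u for UTF-8 input):

```javascript
const lowercaseWord = "\u021Bar\u0103"; // "țară"

// ASCII-only class: rejects the Romanian letters.
const asciiLower = /^[a-z]+$/;

// Unicode property class: matches any lowercase letter.
const unicodeLower = /^\p{Ll}+$/u;

console.log(asciiLower.test(lowercaseWord));   // false
console.log(unicodeLower.test(lowercaseWord)); // true
```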

Also:

htmlspecialchars seems to work just fine. Not sure about addslashes. In general you should audit the correct usage of every string function throughout your application.

Bonus:

1. In MySQL, to sort correctly by a utf8 column, specify the collation after ORDER BY (or specify the collation on that column when you create the table). For example, to order by a utf8 column with Romanian collation:

select * from table order by column collate utf8_romanian_ci desc

This is meant as a simple guide for the situation when you use only utf8 throughout your database and your application, which should be fine for most languages.

2. I have not managed to make DBManager show UTF-8 characters in the results grid, so you will see garbage although the data is stored correctly. The same applies to the mysql command line client (on Windows).

3. There may be some problems with inserting blobs if the default charset is utf8. I am currently looking into this and will update this post.

Monday, February 04, 2008

[SQL Server] Copy data between rows of the same table

How do you solve the problem of copying data from the same column between two existing rows of the same table, in Microsoft SQL Server 2000?

The answer is UPDATE ... FROM:

UPDATE tbl SET val = b.val
FROM tbl, tbl b
WHERE tbl.primaryKey=pk1 AND b.primaryKey=pk2


The syntax may seem a bit weird... but it works.

Tuesday, November 27, 2007

[Javascript] typeof

Today I needed a way to check the real type of an object in JavaScript. Since typeof always returns 'object' for Date and Array objects, I needed a typeof routine that would return 'array' or 'date' (needed for JSON encoding). Here is a simple method to detect if an object is an Array instance:

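function objectIsArray(o) {
    // null and undefined have no constructor property to inspect
    if (o === null || o === undefined)
        return false;
    // (new Array).constructor is the Array function itself
    return o.constructor === (new Array).constructor;
}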

The constructor property is always defined for arrays, and two functions compare equal if and only if they are the same function. The same check works for dates by comparing against (new Date).constructor.
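Building on the same constructor comparison, a typeof replacement that returns 'array' or 'date' could look roughly like this. The names objectIsDate and realTypeOf are made up for this sketch.

```javascript
// Same constructor check as for arrays, applied to Date.
function objectIsDate(o) {
  return o !== null && o !== undefined && o.constructor === Date;
}

// Returns "array" or "date" where plain typeof would say "object",
// "null" for null, and the regular typeof result for everything else.
function realTypeOf(o) {
  if (o === null) return "null";
  if (o !== undefined && o.constructor === Array) return "array";
  if (objectIsDate(o)) return "date";
  return typeof o;
}

console.log(realTypeOf([1, 2, 3]));   // "array"
console.log(realTypeOf(new Date(0))); // "date"
console.log(realTypeOf({ a: 1 }));    // "object"
```

One caveat: an array created in another window or frame has a different Array constructor, so a constructor comparison does not recognize it.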