Open source database highly optimized for Web applications

한국어 Login Register

Best Way to Store IP Address in CUBRID

Overview of IP address

There are various ways to store IP addresses in a database. All are valid, but we are interested in the most optimized solution both in terms of less memory allocation and higher performance of operations made on the stored values.

So, basically an IP address (we will talk about the IPv4 - Internet Protocol version 4) is a four dotted octet string which uses 32-bit (four-byte) addresses (min: 0.0.0.0, max: 255.255.255.255). It limits the address space to 4,294,967,296 (232) possible unique addresses.

Overall, there are several ways to represent an IP address.

Notation Value Conversion from dot-decimal
Dot-decimal notation 192.0.2.235 N/A
Dotted Hexadecimal 0xC0.0x00.0x02.0xEB Each octet is individually converted to hexadecimal form
Dotted Octal 0300.0000.0002.0353 Each octet is individually converted into octal
Hexadecimal 0xC00002EB or C0.00.02.EB Concatenation of the octets from the dotted hexadecimal
Decimal 3221226219 The 32-bit number expressed in decimal
Octal 030000001353 The 32-bit number expressed in octal

Best data type for IP address

The Dot-decimal notation is the most commonly used representation, but in order to store it in a database as it is, it requires to define a CHAR (otherwise called CHARACTER) type column (VARCHAR and other character data types would also be fine). It means 15 bytes are needed to store all the 15 characters of an IP address and searching among the stored values will take significantly more time if stored in numbers.

The ideal data type to store IPv4 address would be INT(10) UNSIGNED which would take 4 bytes only. Length 10 is set, because the largest IP address in the form of a number is 4,294,967,296 which can have 10 digits maximum. UNSIGNED because it cannot be less than 0 anyway.

However CUBRID (and other database systems like Oracle, SQL Server, and PostgreSQL) does not support the UNSIGNED type for numeric data types, because it follows the '92 ANSI Standard SQL, which does not know UNSIGNED. But this is not an issue, because we can likewise store it in just INT(10). Technically, there is no difference how it is stored. You will see below.

The following is a PHP code which illustrates how to convert an IP to INTEGER and store it. PHP provides a built-in function ip2long which converts a dot-decimal IP address (e.g. 192.168.121.215) to an INTEGER. Similar results can be obtained in other programming languages, too.

$sql = 'INSERT INTO ipTableInt VALUES (' . ip2long($ip) ')';

echo sprintf("%u", ip2long($ip));	// Output: 3232266711. This is how an IP address should look like in UNSGINED INT, but...
echo ip2long($ip);					// Output: -1062700585. This is how it will be stored actually in the database.

As you can see above, when inserted, the value of the stored IP address is -1062700585. But as long as we do not care how it looks like (whethere it is readable or not), this is the most efficient way to store an IP address in CUBRID and other database systems.

To get the real Dot-decimal representation of the IP address back:

$sql = 'SELECT ip FROM ipTableInt WEHRE ip = ' . ip2long($ip);
$result = cubrid_execute($connection, $sql);

if ($result){
	$row = cubrid_fetch($result);

	echo $row[0];			//	-1062700585			This is how it is stored.
	echo long2ip($row[0]);	//	192.168.121.215		PHP will convert it back correctly.
}

As you can see, PHP does the work for you, i.e. it converts the negative value integer with long2ip function to the correct dot-decimal form. Thus, storing an IP address in signed INT(10):

  • Takes least memory space (4 bytes).
  • Sorting and indexing numeric values is much faster for a database server than if stored in characters.
  • Thus, searching within the numeric ranges is much faster than by character/byte comparison for strings.

Whatever data type you choose, once stored, do not forget to index the column to have faster lookup performance.

Disadvantage of INT(10)

Like was mentioned before, there is only one disadvantage: the value stored in the database is unreadable unless you use VIEW or programmatically convert to readable format as illustrated in the above code.

Other ways to store an IP address

BINGINT

Size: 8 bytes
Range: from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

You can also store an IP address in BIGINT (other syntax BIG).

CREATE TABLE "iptablechar"(
	"ip" bigint
);

For instance, to save a 192.168.121.215 dot-decimal IP address in a 8-byte BIGINT type, just insert it as it is:

$sql = 'INSERT INTO ipTableInt VALUES (' . sprintf("%u", ip2long($ip)) . ')';

As you see, we have used sprintf function to return formatted string. In the example of INT(10) we have explained that ip2long() will be automatically converted into 32-bit signed integer which can overflow and display negative values. To display the decimal form of an IP address as it is in the form of an unsigned integer we need to format it.

To retrieve back:

$sql = 'SELECT ip FROM ipTableInt WEHRE ip = ' . sprintf("%u", ip2long($ip));
$result = cubrid_execute($connection, $sql);

if ($result){
	$row = cubrid_fetch($result);

	echo $row[0];				//	3232266711		This is how it is stored.
	echo long2ip($row[0]);		//	192.168.121.215
}

CHAR(4)

Size: 4 bytes
Range: from 0 to 255

You can also store an IP address in CHAR(4) (other syntax CHARACTER(4)).

CREATE TABLE "iptablechar"(
	"ip" character(4)
);

For instance, to save a 192.168.121.215 dot-decimal IP address in a 4-byte CHAR type, separate each octet (you will get four ASCII values) and convert them to their character representation using CUBRID's built-in function CHR():

/* split ip up into its octets*/
list($octet1, $octet2, $octet3, $octet4) = explode('.', $ip);

$sql = 'INSERT INTO ipTableInt VALUES (CHR(' . $octet1 . ') || CHR(' . $octet2 . ') || CHR(' . $octet3 . ') || CHR(' . $octet4 . '))';

To retrieve back:

$sql = 'SELECT ip FROM ipTableInt WEHRE ip = CHR(' . $octet1 . ') || CHR(' . $octet2 . ') || CHR(' . $octet3 . ') || CHR(' . $octet4 . ')';
$result = cubrid_execute($connection, $sql);

if ($result){
	$row = cubrid_fetch($result);

	echo $row[0];			//	À¨y×				This is how it is stored.
	echo ord($row[0][0]) . '.' . ord($row[0][2]) . '.' . ord($row[0][3]) . '.' . ord($row[0][4]);
	//	Output: 192.168.121.215
}

BIT VARYING(32)

Size: 4 bytes
Range: from 0x00000000 to 0xFFFFFFFF

You can also store an IP address in a binary format BIT VARYING(32). In CUBRID you can INSERT BIT VARYING data type values in both binary format, prepending 0b or B'', and hexadecimal format, prepending 0x or X''.

CREATE TABLE "iptablechar"(
	"ip" bit varying(32)
);

For instance, to save a 192.168.121.215 dot-decimal IP address in a 4-byte BIT VARYING type, convert it to hexadecimal form using PHP's built-in dechex() function:

$sql = "INSERT INTO ipTableInt VALUES (X'" . dechex(ip2long($ip)) . "')";

To retrieve back:

$sql = "SELECT ip FROM ipTableInt WEHRE ip = X'" . dechex(ip2long($ip)) . "'";
$result = cubrid_execute($connection, $sql);

if ($result){
	$row = cubrid_fetch($result);

	echo $row[0];					//	C0A879D7	This is how it is stored.
	echo long2ip(hexdec($row[0]));	//	Output: 192.168.121.215
}

Conclusion

For now as we consider only IPv4 addresses, probably the best data type to store them in CUBRID is still INTEGER, though it is not going to be readable. But for most computational works with an IP address, we probably do not need to view it. We will create a guide for IPv6 as it gets more widely deployed.

Now as we know how to store an IP address in a CUBRID database, we probably would want to try it in the real world. For this reason the next guide will cover how to efficiently implement ip2country, a well known solution for determining the country name based on a user IP address.




You are either using a very old browser or a browser that is not supported.
In order to browse cubrid.org you need to have one of the following browsers:



Internet Explorer: Mozilla Firefox: Google Chrome: