Iterating hash map values, keys without iterator allocations / JAVA

1. The problem

Nobody likes lags. So for some (real time) applications like real time rendered 3D games it is nessessary to avoid memory allocations and garbage collection when beeing e.g. ingame as GC might block your game just when you do not need or want it.

Additionally there is no way to control garbage collection in standard JVMs.

When profiling your application with tools like Visual VM (http://visualvm.java.net) you will notice iterators very fast as one main cause for frequent garbage collection.

While you can work around creating iterators of ArrayList using simple old for loops like


	//
	ArrayList arrayList = new ArrayList();
	arrayList.add("Blues");
	arrayList.add("Rock ann Roll");
	arrayList.add("Swing");

	// avoids creating an iterator
	for (int i = 0; i < 3; i++)
	for (int j = 0; j < arrayList.size(); j++) {
		String value = arrayList.get(j);
		System.out.println(value);
	}

	// will create 3 iterators!
	for (int i = 0; i < 3; i++)
	for (String value : arrayList) {
		System.out.println(value);
	}

you simply can not work around looping over keys or values within a java.util.HashMap.

The following code will always create iterators on heap!


	// create hash map
	HashMap hashMap = new HashMap();
	hashMap.put("1", "Blues");
	hashMap.put("2", "Rock and roll");
	hashMap.put("3", "Swing");

	// iterate values, will create 3 iterators!
	for (int i = 0; i < 3; i++)
	for (String value : hashMap.values()) {
		System.out.println(value);
	}

	// iterate keys, will create 3 iterators!
	for (int i = 0; i < 3; i++)
	for (String key : hashMap.keySet()) {
		System.out.println(key);
	}

	// iterate entries, will create 3 iterators!
	for (int i = 0; i < 3; i++)
	for (Map.Entry entry : hashMap.entrySet()) {
		System.out.println(entry.getKey() + ":" + entry.getValue());
	}

2. The solution

As there is no way to avoid creating iterators in theses cases I just programmed an own unordered hash map which is set up to reuse a single values and key iterator.

There is one drawback as you can not do nesting with the same hash map iterator in it self which is rarely required.

Basically a hash map does some kind of partition based on a hash function (which is provided e.g. in String.hashCode(), ..., Object.hashCode())

Here it is called buckets. Each bucket can contain several key value pairs.
The bucket is identified by the hash function of the key object. The key value pair is finally identified by Object.equals() in a bucket as a bucket can contain multiple key value pairs.

The number of buckets always equals the number of elements stored in a hash map.

In best case you would have N buckets for N elements and each bucket would contain a single key value pair.
In this case the complexity for put, get, remove would be O(1). With collisions (means multiple key value pairs in one bucket) complexity will have a little O(n) shadow.


    package net.drewke.tdme.utils;

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.Map;

    /**
     * Hashmap which keeps track of keys and values in a array list
     * @author Andreas Drewke
     *
     * @param 
     * @param 
     */
    public class HashMap<K,V> {

            /**
             * Values Iterator 
             * @author Andreas Drewke
             * @param 
             * @param 
             */
            public static class ValuesIterator<K,V> implements Iterator, Iterable {

                    private HashMap<K,V> hashMap;
                    private int bucketIdx;
                    private int keyValuePairIdx;
                    private int elementIdx;

                    /**
                     * Public constructor
                     * @param hashmap
                     */
                    public ValuesIterator(HashMap<K,V> hashmap) {
                            this.hashMap = hashmap;
                            reset();
                    }

                    /**
                     * Reset
                     * @return this iterator
                     */
                    public Iterator reset() {
                            this.bucketIdx = 0;
                            this.keyValuePairIdx = 0;
                            this.elementIdx = 0;
                            return this;
                    }

                    /*
                     * (non-Javadoc)
                     * @see java.lang.Iterable#iterator()
                     */
                    public Iterator iterator() {
                            return reset();
                    }

                    /*
                     * (non-Javadoc)
                     * @see java.util.Iterator#hasNext()
                     */
                    public boolean hasNext() {
                            return elementIdx < hashMap.elements;
                    }

                    /*
                     * (non-Javadoc)
                     * @see java.util.Iterator#next()
                     */
                    public V next() {
                            while (bucketIdx < hashMap.buckets.size()) {
                                    ArrayList<Pair<K,V>> bucket = hashMap.buckets.get(bucketIdx);
                                    if (keyValuePairIdx == bucket.size()) {
                                            keyValuePairIdx = 0;
                                            bucketIdx++;
                                    } else {
                                            Pair<K,V> keyValuePair = bucket.get(keyValuePairIdx++);
                                            elementIdx++;
                                            return keyValuePair.value;
                                    }
                            }
                            return null;
                    }
            }

            /**
             * Keys Iterator 
             * @author Andreas Drewke
             * @param 
             * @param 
             */
            public static class KeysIterator<K,V> implements Iterator, Iterable {

                    private HashMap<K,V> hashMap;
                    private int bucketIdx;
                    private int keyValuePairIdx;
                    private int elementIdx;

                    /**
                     * Public constructor
                     * @param hashmap
                     */
                    public KeysIterator(HashMap<K,V> hashmap) {
                            this.hashMap = hashmap;
                            reset();
                    }

                    /**
                     * Reset
                     * @return this iterator
                     */
                    public Iterator reset() {
                            this.bucketIdx = 0;
                            this.keyValuePairIdx = 0;
                            this.elementIdx = 0;
                            return this;
                    }

                    /*
                     * (non-Javadoc)
                     * @see java.lang.Iterable#iterator()
                     */
                    public Iterator iterator() {
                            return reset();
                    }

                    /*
                     * (non-Javadoc)
                     * @see java.util.Iterator#hasNext()
                     */
                    public boolean hasNext() {
                            return elementIdx < hashMap.elements;
                    }

                    /*
                     * (non-Javadoc)
                     * @see java.util.Iterator#next()
                     */
                    public K next() {
                            while (bucketIdx < hashMap.buckets.size()) {
                                    ArrayList<Pair<K,V>> bucket = hashMap.buckets.get(bucketIdx);
                                    if (keyValuePairIdx == bucket.size()) {
                                            keyValuePairIdx = 0;
                                            bucketIdx++;
                                    } else {
                                            Pair<K,V> keyValuePair = bucket.get(keyValuePairIdx++);
                                            elementIdx++;
                                            return keyValuePair.key;
                                    }
                            }
                            return null;
                    }
            }

            /**
             * @author Andreas Drewke
             * @param 
             * @param 
             */
            private static class Pair<K,V> {
                    private K key;
                    private V value;
            }

            private int capacity = 256;
            private int elements = 0;
            private ArrayList<ArrayList<Pair<K,V>>> buckets = new ArrayList<ArrayList<Pair<K,V>>>();
            private HashMap.ValuesIterator<K,V> valuesIterator = new ValuesIterator<K,V>(this);
            private HashMap.KeysIterator<K,V> keysIterator = new KeysIterator<K,V>(this);

            /**
             * Public constructor
             */
            public HashMap() {
                    for (int i = 0; i < capacity; i++) {
                            buckets.add(new ArrayList<Pair<K,V>>());
                    }
            }

            /**
             * Clears this hash table
             */
            public void clear() {
                    elements = 0;
                    for (int i = 0; i < capacity; i++) {
                            buckets.get(i).clear();
                    }
            }

            /**
             * Grow hash map
             */
            private void grow() {
                    // store old buckets
                    ArrayList<ArrayList<Pair<K,V>>> oldBuckets = buckets;

                    // create new buckets
                    buckets = new ArrayList<ArrayList<Pair<K,V>>>();
                    capacity*= 2;
                    elements = 0;
                    for (int i = 0; i < capacity; i++) {
                            buckets.add(new ArrayList<Pair<K,V>>());
                    }

                    // recreate bucket table
                    for (int i = 0; i < oldBuckets.size(); i++) {
                            ArrayList<Pair<K,V>> bucket = oldBuckets.get(i);
                            for (int j = 0; j < bucket.size(); j++) {
                                    Pair<K,V> keyValuePair = bucket.get(j);
                                    put(keyValuePair.key, keyValuePair.value);
                            }
                    }
            }

            /**
             * Get the associated value of given object / key
             */
            public V get(K key) {
                    ArrayList<Pair<K,V>> bucket = buckets.get(Math.abs(key.hashCode()) % capacity);
                    for (int i = 0; i < bucket.size(); i++) {
                            Pair<K,V> keyValuePair = bucket.get(i);
                            if (keyValuePair.key.equals(key)) return keyValuePair.value;  
                    }
                    return null;
            }

            /**
             * Removes associated value of given object / key
             */
            public V remove(K key) {
                    ArrayList<Pair<K,V>> bucket = buckets.get(Math.abs(key.hashCode()) % capacity);
                    for (int i = 0; i < bucket.size(); i++) {
                            Pair<K,V> keyValuePair = bucket.get(i);
                            if (keyValuePair.key.equals(key)) {
                                    bucket.remove(i);
                                    elements--;
                                    return keyValuePair.value;  
                            }
                    }
                    return null;
            }

            /**
             * Put given value with associated key into this hash map
             */
            public V put(K key, V value) {
                    if (elements + 1 >= buckets.size()) grow();
                    V oldValue = remove(key);
                    ArrayList<Pair<K,V>> bucket = buckets.get(Math.abs(key.hashCode()) % capacity);
                    Pair<K,V> keyValuePair = new Pair<K,V>();
                    keyValuePair.key = key;
                    keyValuePair.value = value;
                    bucket.add(keyValuePair);
                    elements++;
                    return oldValue;
            }

            /**
             * @return number of elements
             */
            public int size() {
                    return elements;
            }

            /**
             * @return Values Iterator
             */
            public ValuesIterator<K,V> getValuesIterator() {
                    valuesIterator.reset();
                    return valuesIterator;
            }

            /**
             * @return Keys Iterator
             */
            public KeysIterator<K,V> getKeysIterator() {
                    keysIterator.reset();
                    return keysIterator;
            }

            /*
             * (non-Javadoc)
             * @see java.lang.Object#toString()
             */
            public String toString() {
                    String string = new String();
                    for (int i = 0; i < buckets.size(); i++) {
                            ArrayList<Pair<K,V>> bucket = buckets.get(i);
                            for (int j = 0; j < bucket.size(); j++) {
                                    Pair<K,V> keyValuePair = bucket.get(j);
                                    if (string.length() > 0) string+=", ";
                                    string+= keyValuePair.key + "=" + keyValuePair.value;
                            }
                    }
                    string ="HashMap[" + string + "]";
                    return string;
            }

    }

With the own hash map implementation you can iterate over values and keys while allocating no new iterators as they are reused:


	// create hash map
	HashMap hashMap = new HashMap();
	hashMap.put("1", "Blues");
	hashMap.put("2", "Rock and roll");
	hashMap.put("3", "Swing");

	// iterate values, will reuse iterator
	for (int i = 0; i < 3; i++)
	for (String value : hashMap.getValuesIterator()) {
		System.out.println(value);
	}

	// iterate over keys, will reuse iterator
	for (int i = 0; i < 3; i++)
	for (String key : hashMap.getKeysIterator()) {
		System.out.println(key);
	}

Bits and Bytes

1. Bit representation in ordinal data types

Every bit has corresponding value in a ordinal data type

BIT_0 -> 1
BIT_1 -> 2
BIT_2 -> 4
BIT_3 -> 8
BIT_4 -> 16
BIT_5 -> 32
BIT_6 -> 64
BIT_7 -> 128

BIT_n -> 2^n

2. How to check if a bit is set

bitN = (value & BIT_N) == BIT_N

3. How to set a bit

value|= BIT_N

4. How to unset/toggle a bit

value~= BIT_N

5. Most/least significant bit

MSB                   LSB
7 6 5 4 3 2 1 0

the most significant bit in a byte is BIT_7 with a value of 128
the least significant bit in a byte is BIT_0 with a value of 1

6. Shifting

int<<=1

will multiply a int by 2 by moving the bits 1 bit left in direction to the most significant bit

int>>=1

will divide a int by 2 by moving the bits 1 bit to right in direction to the least significant bit

7. What else can you do

mask more than one bit

byte = int & 0xFF

will return only the first 8 bit (sum of BIT_0 … BIT_7 = 0xFF) from an integer

combined mask and shifting

byte0ofint = int & 0xFF
byte1ofint = (int & (0xFF << 8)) >> 8
byte2ofint = (int & (0xFF << 16)) >> 16
byte3ofint = (int & (0xFF << 24)) >> 24

8. Byte order

each cpu stores ordinal data types bigger than one byte in a specific byte order

intel: little endian byte order begins with LSB byte
powerpc: big endian byte order begins with MSB byte
arm: can do both as far as i know

so whenever you store and read ordinal data types bigger than a byte binary you should care about byte order

9. Why?

  • you need most likely work with bits and bytes when doing
    • hardware related programming
    • network programming
    • other stuff
      • e.g. you can implement sets using bits

Injections

In last time i saw more often security holes in web applications from web developers that actually do create web applications and run them – SQL injections.

I do not understand that, because avoiding them is actually no voodoo.

1. Understanding SQL injections

Its easy. You instruct a mysql server to do things with SQL commands. At the application level these are normal strings consisting of a human readable syntax.
So this is actually where exactly the injection can happen.

Usually you will not write harmful SQL code on your own, so where do they really come from?

This query does not look harmful:

$sql = ‚SELECT `id`, `name` FROM `accounts`‘;

This query does:

$sql = ‚SELECT `id`, `name` FROM `accounts` WHERE `id` = ‚ . $id;

Just depending where your $id does come from.

Consider the following code:

$aid = $_GET[‚aid‘]; // read account id from GET parameter
$sql = ‚SELECT `id`, `name` FROM `accounts` WHERE `id` = ‚ . $aid;
mysql_query($sql);

This can lead to a SQL injection, as I could do the following

curl „http://yourwebapplicationhost/account.php?id=0%3B%20DROP%20TABLE%20%60accounts%60%3B“

The above code just means I can directly manipulate the SQL command.
I just need to end the SQL „SELECT“ and add a SQL „DROP TABLE `accounts`;“

Its the same like:

$aid = ‚0; DROP TABLE `accounts`‘;
$sql = ‚SELECT `id`, `name` FROM `accounts` WHERE `id` = ‚ . $aid;
mysql_query($sql);

I could give more examples. But the idea should be clear now.
No. Magic quotes is not a fix. Its rather a mess and deprecated. Just qoogle that.

2. How to prevent SQL injections

2.1 Escaping

A key to success could be escaping. In easy words escaping means that „user input“ data will be prepared to be safely used in a SQL query.
If tied to old functional mysql API you just need to do it with mysql_real_escape_string(). If using PDO just have a look at: PDO::quote()

So our code could be fixed like:

$aid = $_GET[‚aid‘]; // read account id from GET parameter
$sql = ‚SELECT `id`, `name` FROM `accounts` WHERE `id` = ‚ . mysql_real_escape_string($aid);
mysql_query($sql);

2.2 Prepared Statements

A better way to prevent SQL injections is to use PDO with prepared statements.
Prepared statements have advantages and disadvantes too.

2.2.1 Advantages:

Prepared statements are SQL commands without the actual parameters. They are more like templates for queries of the same type. They become compiled in the SQL server and can be reused which gives a performance gain as compiling is only required one time.

I tested it once and if i remember right, my application speed increased by 20%
Unfortunatly web application requests have a very short life time so that reusing them does not often makes that much sense.

Well, as the parameters are not included in the SQL command you just can not tweak the SQL command but only its parameters.

2.2.2 Disadvantages:

You need one more request to the database server to set up the prepared statement.
You cannot use parameter binding in SQL commands like:

SELECT `id`,`name` FROM `accounts` WHERE `id` IN(:id1, :id2)

You would have to use PDO::quote() again and construct your SQL like:

// set up ids
$ids = array(1,2);
// quote ids
foreach($ids as &$id) $id = PDO::quote($id);
// construct SQL
$sql = ‚SELECT `id`,`name` FROM `accounts` WHERE `id` IN(‚ . implode(‚,‘, $ids) . ‚)‘;

3. Conclusion:

You should always be alerted if putting input data not constructed directly from or not validated of your application itself from like $_GET, $_POST, $_REQUEST, … into e.g. a SQL query and thus escape it.
Its even better to have a abstract security concept which might be e.g. using prepared statements.

4. XSS/HTML injections

Injections mean injecting something! So injections are all similar.
A simple example of a XSS or HTML injection would be:

$userName = $_GET[‚username‘];
echo ‚Hallo ‚ . $userName;

So with this kind of script i can easily put HTML or Javascript on your page just by putting some HTML or Javascript into a GET parameter „username“ when calling the script.

You might want to have a look at strip_tags() or using certificates.

I think, i dont need to explain that further.

5. Code injections

Its the same with code injections. The difference is just where you put the injected data. With PHP code injections  a candidate e.g. is:

$code = $_GET[‚code‘];
eval($_GET[‚code‘]);

6. So what?

So this article may not be complete but it should at least make up an idea about this topic in your mind.

Please be aware that this article is about injections related to PHP, but they work the same way in other languages as well like javascript, java, …