Archive for the 'Projects' Category

Apr
17

This short article focuses on one special area of PHP optimization: the php.ini file.

The php.ini file is PHP's configuration file. It contains directives and settings that affect how PHP scripts are executed. The modifications that can speed up execution are quite specific, so you have to decide which of the settings listed below make sense for you and your application. This article is part of the web-application performance series.

realpath_cache_size & realpath_cache_ttl

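PHP uses a realpath cache to remember the resolved paths of included and required files. This cache was introduced in PHP 5.1.0 and brought a considerable speed improvement. realpath_cache_size sets the size of the cache and realpath_cache_ttl sets how long (in seconds) an entry stays valid. The default size should be 16K, which is a good starting point.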
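As a sketch, the two directives could be set like this in php.ini. The values are only illustrative assumptions, not measured recommendations; an application that includes many files may need a larger cache.

```ini
; php.ini (illustrative values, tune them for your application)
; size of the realpath cache
realpath_cache_size = 16K
; number of seconds a cached path stays valid (120 is PHP's default)
realpath_cache_ttl = 120
```

To see how much of the cache a script actually uses, PHP offers the function realpath_cache_size() (available since PHP 5.3.2).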

register_globals

Since PHP 4.2.0 the default value is Off; nevertheless, you should make sure that it isn't set to On. On the one hand, register_globals is well known as a big security issue; on the other hand, with register_globals enabled PHP additionally creates a global variable for every entry in $_GET, $_POST, $_COOKIE, $_REQUEST, $_SERVER and $_SESSION. This costs an unnecessary amount of time!

always_populate_raw_post_data

If always_populate_raw_post_data is set to On, PHP fills $HTTP_RAW_POST_DATA with the raw data that comes in with a POST request. Turn this directive Off if you're not planning something special with it. It saves time and memory.

register_long_arrays

This directive copies all request data into the outdated $HTTP_*_VARS arrays. The same data is available through the superglobals, for instance $_GET. You should turn register_long_arrays Off to save time and memory.

expose_php

With expose_php enabled, PHP adds an information string with its version to the response (the X-Powered-By header). Set this directive to Off to avoid the unnecessary string copy. In addition, it is safer to reveal only the necessary information and nothing more.

register_argc_argv

If you run your PHP scripts behind a web server, you don't need argc (the number of command line arguments) and argv (the array that contains the command line arguments). If register_argc_argv is set to On, PHP tries to parse these variables and copy them into the symbol table of the script. This takes unnecessary time. Turn it Off to save time.

asp_tags

The asp_tags directive makes PHP also recognize ASP-style tags like <% and %>. It's not best practice to use ASP tags in PHP scripts. If you don't use these tags, you should turn the asp_tags directive Off to save time.
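Taken together, the switches discussed above could look like the following php.ini fragment. This is only a sketch for the PHP 5.x versions this article refers to; several of these directives were removed in later PHP releases, so check which ones your version still knows.

```ini
; php.ini fragment collecting the directives discussed in this article
; (PHP 5.x; some of them no longer exist in newer PHP versions)
register_globals = Off
always_populate_raw_post_data = Off
register_long_arrays = Off
expose_php = Off
register_argc_argv = Off
asp_tags = Off
```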

Apr
16

All methods that I introduce below can be combined. To optimize your static files according to “best practice” you have to use all of them. Best practice, in this case, means that you serve a minified file that was merged (it contains more than one source file), is delivered with good compression and has its cache headers set. If you do so, then you have a good understanding of how to optimize static files. This article is part of the web-application performance series.

Merge

Usually, every web application needs more than one JavaScript file. Most web 2.0 applications need more than 3-4 scripts. In addition, they need cascading stylesheets, and usually more than one of those, too. To sum up, you have 7-9 files that the browser must request from the web server, and every request brings a lot of overhead in the form of HTTP headers. You can minimize this overhead by merging the files. This is a bit tricky, because you still need the separate JavaScript files to continue developing. After the development stage, you merge the files into one file. To avoid inconsistencies, use something like a version number that you increment with every release. For instance, the file main-v003191.js contains the jQuery framework, your Ajax controller and a form input validator. You can do the same with cascading stylesheets. One possible option would be to use the lightscript that I published on this blog.
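The merge step could be sketched in PHP as follows. The function name merge_files() and its parameters are assumptions made for this example (it is not the lightscript mentioned above), and the type hints require PHP 7.

```php
<?php
// Hypothetical helper: concatenates source files into one versioned bundle.
function merge_files(array $sources, string $targetDir, string $name, int $version, string $ext): string
{
    $bundle = '';
    foreach ($sources as $file) {
        $content = file_get_contents($file);
        if ($content === false) {
            throw new RuntimeException("cannot read $file");
        }
        // a separator keeps adjacent files from running together
        // (for JavaScript a semicolon guards against a missing one at the file end)
        $bundle .= $content . ($ext === 'js' ? ";\n" : "\n");
    }

    // e.g. main-v003191.js for $name = 'main', $version = 3191, $ext = 'js'
    $target = sprintf('%s/%s-v%06d.%s', rtrim($targetDir, '/'), $name, $version, $ext);
    if (file_put_contents($target, $bundle) === false) {
        throw new RuntimeException("cannot write $target");
    }
    return $target;
}
```

Because the version number is part of the file name, a new release automatically gets a new URL, which avoids stale copies in browser caches.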

Minification

Minification means removing all comments and unnecessary whitespace (including line breaks) from your scripts (JS/CSS). Minifying noticeably improves performance: an example file of 120 KB can be reduced to 80 KB. You can use the YUI Compressor or minifycss to minify your scripts.

Modify headers

To decrease the load on your web servers, set the Cache-Control and Expires headers in a useful way. Set the Cache-Control header to public and define a max-age. Using the Expires header is often called best practice; set it far in the future. Note: if you release a new version of your scripts, you have to change the file name, for example by incrementing a version number, to avoid cache inconsistency problems. Read more about modifying headers for Lighttpd and Apache2.

Spread hosts

Earlier in the history of the World Wide Web, browsers supported just 2 parallel file transfers from the same host. Nowadays, the number of parallel requests to the same host varies from 2 up to 8. To deliver all pictures and scripts as fast as possible, you have to serve them from more than one host. You can create subdomains like aa.static.example.com, ab.static.example.com, ac.static.example.com and so on. This idea works very well if you have a lot of pictures on your website.
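When you spread files over several hosts, the same file should always be requested from the same host, otherwise the browser cache is useless. A minimal sketch, assuming a hypothetical helper static_host() and the example host names from above:

```php
<?php
// Hypothetical helper: picks one of several static hosts for a given path.
function static_host(string $path, array $hosts): string
{
    // crc32() is deterministic, so the same path always maps to the same host;
    // the mask keeps the checksum non-negative on 32-bit systems
    $index = (crc32($path) & 0x7fffffff) % count($hosts);
    return 'http://' . $hosts[$index] . '/' . ltrim($path, '/');
}

$hosts = ['aa.static.example.com', 'ab.static.example.com', 'ac.static.example.com'];
echo static_host('/img/logo.png', $hosts), "\n";
```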

Compression

Compressing files is very useful to save bandwidth and to respond faster. On the other hand, compressing files needs CPU power and time, so it only pays off for files of 10 KB or more. You can control this minimum size in nearly all compression modules that web servers offer. Read more about compression with specific web servers in the web server performance chapter.

Apr
15

Lighttpd (often called Lighty) is a small and popular web server. Some large web applications use lighttpd to serve their content (youtube, imageshack.us, myspace). The main competitor of lighty is Apache2, which is a process-based web server. The big advantage of lighty is its event-driven model, which makes the handling of requests really slim. This article gives a short introduction to speeding up lighty. This article is part of the web-application performance series.

Maximize the number of Connections

By default, Lighty is limited to a maximum of 1024 open file descriptors. You can change this in the configuration file lighttpd.conf. To run the server in a production environment with heavy load you have to increase this value. Set it to 2048 or 4096 (depending on your hardware). Be aware that a simple PHP request needs 3 file descriptors in lighttpd: one for the TCP/IP connection to the user, one for the socket to FastCGI and one for checking whether the file exists. With that you can calculate how many file descriptors you need to serve your load; for example, 500 concurrent PHP requests need about 1500 file descriptors.

#number of open file descriptors
server.max-fds = 2048

stat_cache

Lighty has intelligent stat handling. The stat call is used to get file system information about a specific file. The stat cache caches these requests to avoid an unnecessary number of stat calls and the associated accesses to the hard drive. Another option to handle something like stat() is FAM (File Alteration Monitor). FAM comes with a daemon that monitors all files. You can enable the stat cache with one of the following lines in your configuration file:

#enable the simple stat_cache
server.stat-cache-engine = "simple"
#or with a running FAM daemon
server.stat-cache-engine  = "fam"

Keep-Alive vs. Close

If you are running a web application with high concurrency, you have to think about keep-alive requests. It's not good to keep file descriptors open and let them idle; doing so wastes a lot of resources. It can be a huge performance improvement to set server.max-keep-alive-requests to 0 and avoid idle file descriptors.

server.max-keep-alive-requests = 0

Using XCache

XCache is an opcode cache developed by the lighty labs. It improves the performance of lighttpd + PHP by caching the compiled opcode in shared memory (SHM), so the compiled PHP file comes directly from RAM. This can be more than 5 times faster than an uncached PHP request on lighttpd.

Important configuration settings are xcache.ttl, xcache.size, xcache.cacher and xcache.optimizer. xcache.ttl is the time to live of a cached opcode entry before it is marked as invalid. If this value is set to 0, the time to live is unlimited and the entry is never marked as invalid. The xcache.size option sets the size of the shared memory used for the opcode cache (the memory-mapped file). To turn the cache on or off, use the xcache.cacher directive with “On” or “Off”.

Below you can see an example:

xcache.shm_scheme = 		"mmap"
xcache.mmap_path = 		"/var/cache/xcache.mmap"
xcache.size  = 			64M
xcache.count = 			1
xcache.ttl   = 			0
xcache.slots = 			8K
xcache.gc_interval = 		0
xcache.readonly_protection = 	Off
xcache.cacher = 		On
xcache.var_gc_interval = 	300
xcache.stat = 			On
xcache.optimizer =		On
xcache.var_count =             	1
xcache.var_size  =            	0M
xcache.var_slots =            	8K
xcache.var_ttl   =             	0
xcache.var_maxttl   =          	0

Mod_compress

This module is used to compress files with gzip, bzip2 and deflate. Compressing files reduces the bandwidth you need. Be aware that any compression costs time and CPU load. To save CPU time, the results of the compression can be cached. To do so, mod_compress needs a few settings:

compress.allowed-encodings = ("bzip2", "gzip", "deflate")
# you have to create the cache directory! the web server
# is not able to do that for you
compress.cache-dir = "/var/www/cache/"
#add all file types (MIME types) you need
compress.filetype           = ("text/plain", "text/html")

For more information, check the manual.

Mod_expire

Mod_expire is used to control the cache headers of lighttpd. To cache your static files, this module is all you need. It accepts the following directive:

expire.url = ( "/images/" => "access plus 20 days" )

To learn more about the syntax and how to use this module check out the manual.

Apr
14

This article introduces you to the optimization of the most widely used web server, Apache2. It gives just an overview of optimizing the configuration file and shows you how to enable useful modules. Of course, there are other ways to optimize an Apache2 setup, for instance code optimization, but they are not in the scope of this article. This article is part of the web-application performance series.

Hostname lookups

If HostNameLookups is set to On, Apache2 does a hostname lookup for every client IP. You don't need this functionality, because it has no effect on the response and it takes a lot of time. So set this directive to Off.

HostNameLookups Off

Keep-alive

Keeping an HTTP connection alive means leaving the file descriptors open in order to handle the next request (from the same user) faster. The idea of keep-alive is nice, but it only makes sense if you have a small number of users. Otherwise you have open file descriptors that tie up resources unnecessarily. So it is better to turn this mechanism Off in order to serve a high number of requests faster. Modify the configuration file:

KeepAlive Off

Adapt workers

The right number of workers (multi-thread & multi-process module) is important for a well-working web server. You can have too few workers, and you can have too many; neither case is desirable. The problem is that you have to test the server under a realistic load to find the right values. You can control the behavior of Apache2 with the following calculation:

# ServerLimit * ThreadsPerChild = MaxClients
ThreadLimit 50
ServerLimit 30
StartServers 5
MaxClients 1500
MinSpareThreads 30
MaxSpareThreads 50
ThreadsPerChild 50

The settings depend on the power of the hardware. In the example above, ServerLimit 30 * ThreadsPerChild 50 gives MaxClients 1500. If you have a server that has only one task (answering HTTP requests), you can increase these settings to let the server handle more load.

Apache benchmark

To test your web server configuration and performance you can use ab (Apache Benchmark), which is usually delivered with the Apache2 package (if not, install it afterwards). You can use this benchmarking tool like this:

ab -n 1000 -c 100 http://example.com/
  • n: number of total requests
  • c: number of concurrent requests
  • url: url to test

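With these values the tool sends 1000 requests in total, at most 100 of them at the same time, which corresponds to roughly 1000/100 = 10 waves of 100 concurrent requests. Note that you can plot the output with gnuplot.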

Mod_expires

Apache2 handles the Expires header with mod_expires. The Expires header is an HTTP header that tells the browser how long the transferred file is valid. You can turn on this module by typing:

 a2enmod expires

into the shell. After that you can configure the web server (httpd.conf):

<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType text/html "access plus 2 hours"
ExpiresByType text/xml  "access plus 2 hours"
ExpiresByType image/jpeg "access plus 10 weeks"
ExpiresByType image/gif "access plus 10 weeks"
#add all types of files that you need
</IfModule>

The web server then adds the Expires header to the responses. It automatically calculates the timestamp according to your settings.

Mod_headers

This module is used to append headers to the HTTP response. After enabling mod_headers with

a2enmod headers

you can use the functionality with adding the following line to the configuration file:

Header append Cache-Control "public"

Mod_deflate

This module is used to compress the server output. You have to enable it with a2enmod deflate; after that you can use it by adding this to the configuration file:

SetOutputFilter DEFLATE
<IfModule mod_deflate.c>
AddOutputFilterByType DEFLATE text/plain text/html text/htm
AddOutputFilterByType DEFLATE application/javascript
#add all types that you need
</IfModule>

Additional

In addition, Apache2 has some more modules to improve the performance of the server. Below I list these modules with links to the documentation:

Apr
13

This article deals with some PHP code optimizations. The list of tips & tricks is by no means complete, and I will try to add more tests. I ran these tests on Windows and Linux environments to make sure that the results are roughly valid. It's really important for a good PHP developer to be aware of the performance aspects of his or her code. The following tips are best practice examples for writing smart PHP code. This article is part of the web-application performance series.

Echo vs. print and concatenation

The comparison of these two output options is the oldest debate in the PHP scene. It is wrong to claim that echo is faster than print because of some function-call overhead in print. Both are built-in language constructs; print is not, as often assumed, a function. The only difference between echo and print is that echo has no return value, whereas print always returns 1 (see: manual). The next problem is the often unknown difference between passing several substrings to echo and concatenating the substrings into one longer string. The first variant uses the comma, the second the dot operator. Using echo with the dot operator is the common case, but it's not best practice.

$str1 = "Test1";
$str2 = "Test2";

echo "Hello world, this is " . $str1 . "  & " . $str2 . "!";

With long substrings the concatenation takes noticeably longer, because PHP first has to build the complete string in memory before it can output it. Below you can see the best practice example:

$str1 = "Test1";
$str2 = "Test2";

echo "Hello world, this is " , $str1 , "  & " , $str2 , "!";

Quotes

The myth of optimizing quotes is often not welcome in the PHP community, but the benchmark tests show that single quotes are a bit faster than double quotes. The reason is that in double-quoted strings the parser looks for variables and escape sequences, which takes more time than plain text parsing. You should use double quotes only when you want to embed variables in the string.

Search strings

We compared strstr() and strpos() with each other. The task is to find a substring in a string. Below you can see the two scenarios:

$sub = '@';
$str = 'mailme@example.com';

//the first way
if(strstr($str, $sub) !== FALSE) echo 'found';
else echo 'not found';

//the second way
if(strpos($str, $sub) !== FALSE) echo 'found';
else echo 'not found';

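After testing these two ways in a loop, I noticed that strpos() works faster and uses less memory than strstr(). That is plausible, because strstr() has to build and return a substring, while strpos() only returns a position.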
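Such a loop test can be sketched with a small timing helper. The function bench() is a hypothetical helper written for this example; the absolute numbers depend heavily on the machine and the PHP version, so treat the output only as a rough comparison.

```php
<?php
// Hypothetical micro-benchmark helper: runs $fn $iterations times
// and returns the elapsed wall-clock time in seconds.
function bench(callable $fn, int $iterations = 100000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$str = 'mailme@example.com';
$sub = '@';

$tStrstr = bench(function () use ($str, $sub) { return strstr($str, $sub) !== false; });
$tStrpos = bench(function () use ($str, $sub) { return strpos($str, $sub) !== false; });

printf("strstr: %.4fs, strpos: %.4fs\n", $tStrstr, $tStrpos);
```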

Comparison of the starting characters

To compare the first n characters of two strings, you have 2 options. You can use strncmp(), which is the smarter one, or you can use substr(). The result should be clear: with just one PHP function call the script performs faster. The best practice scenario is strncmp():

$str1 = 'Teststring1';
$str2 = 'Test_string1';
$len = 4; //compare the first 4 characters

//best practice: strncmp() returns 0 if the first $len characters are equal
$equal = (strncmp($str1, $str2, $len) === 0);

//alternative: two substr() calls plus a comparison
$equal = (substr($str1, 0, $len) === substr($str2, 0, $len));

The result of the test shows that strncmp() is faster than the substr() alternative. The advantage of strncmp() increases with the length of the string. Another scenario could be to use strpos(…) like strncmp() to compare the starting characters. I did this test as well, and strncmp() was faster than strpos() (but not significantly).

Counting operations (count(), sizeof() and strlen())

If you want to know the number of elements in an array, you can use sizeof() or count(). These functions are often called in the condition of a loop, but that is not best practice. It is better to get the number of elements before the loop starts; otherwise you count the elements in every single iteration. Note that sizeof() is just an alias of count(), so any speed difference between the two is marginal at best; simply use count().

//wrong way
for($i = 0; $i < count($arr); $i++){
    //magic
}

//right way
$size = count($arr);
for($i = 0; $i < $size; $i++){
    //magic
}

Read loops and modify loops

Often you have to modify data in complex structures such as arrays. Here you can use PHP's foreach construct. Be aware that the performance of this code depends on the operation you want to perform. After testing a set of operations (reading, modifying, unsetting) I noticed that read operations perform better with foreach, while modifying an array works better with a for loop. The reason is that PHP does more hash lookups if you use foreach with $key => $value. Note that in these tests the foreach loop performed faster when $value was taken by reference.

//best practice for reading
foreach($arr as &$value){
    //magic with $value
}
unset($value); //break the reference that is left over after the loop

To modify an array you should use something like this.

//best practice for modification
$keys = array_keys($arr);
$c = count($keys);

for($i = 0; $i < $c; $i++){
    //magic with $arr[$keys[$i]]
}

Testing array index

To find out whether an entry exists in an array you can use the in_array() function or the isset() function. The in_array() function expects two parameters: the first is the $needle (the value to find) and the second is the $haystack (the array). PHP walks through the values of $haystack and compares each one with $needle. The isset() function just expects the variable and does a single lookup to see whether this index is set or not. Note that the two are not interchangeable: in_array() looks at the values, isset() at the keys. To benefit from isset(), you have to store the entries you search for as array keys.

in_array('test', $array);
isset($array['test']);

I found out that isset() is faster than in_array(). Do not think that this improvement is insignificant: the speed advantage grows proportionally with the size of the array.
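The difference between searching values and looking up keys can be sketched as follows. The array contents are made up for this example; array_flip() is one way to turn a list of values into keys so that isset() becomes usable.

```php
<?php
$values = ['apple', 'banana', 'cherry'];

// in_array() scans the values one by one
$foundByScan = in_array('banana', $values, true);

// array_flip() turns the values into keys, so isset() needs only one hash lookup
$lookup = array_flip($values);   // ['apple' => 0, 'banana' => 1, 'cherry' => 2]
$foundByKey = isset($lookup['banana']);

// caveat: isset() reports false for a key whose value is null;
// array_key_exists() covers that case
$withNull = ['test' => null];
var_dump(isset($withNull['test']));             // bool(false)
var_dump(array_key_exists('test', $withNull));  // bool(true)
```

Flipping the array costs time itself, so it only pays off if you search the same array many times.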

File handling

You have more than one option for file handling in PHP. A simple read from a file can be performed with file() and file_get_contents(). Be aware that the first one returns an array with one entry per line. You have to implode this array to use it as a string (with an empty glue, because file() keeps the line endings). The second function returns exactly this string. In most cases you need the content as a string, so use file_get_contents() for this job. It is more than 200% faster (depending on the length of the file). If you need the individual lines for your task, it is faster to use the file() function.
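The two ways of reading can be compared with a small example. The temporary file and its content are assumptions made for this sketch.

```php
<?php
// demo data in a temporary file
$path = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($path, "line 1\nline 2\nline 3\n");

// the whole file as one string
$asString = file_get_contents($path);

// file() returns one array entry per line and keeps the line endings,
// so an empty glue rebuilds the identical string
$rows = file($path);
$joined = implode('', $rows);

// if the rows are what you need, let file() strip the line endings
$cleanRows = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

unlink($path);
```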

Replacing strings

If you want to replace strings in PHP you can use str_replace() and preg_replace(). For simple, fixed search strings you should use str_replace(). This function is much faster than preg_replace() because it works in a simple and straightforward way. But be aware that preg_replace() is the better choice if you have a complex pattern: one call to preg_replace() is smarter than 2 or more calls to str_replace().
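A short sketch of both cases, with made-up example strings. Note that str_replace() also accepts arrays, so several fixed replacements still fit into a single call.

```php
<?php
$text = 'Hello  World,   this is PHP';

// fixed strings: one str_replace() call handles several search/replace pairs
$simple = str_replace(['Hello', 'World'], ['Hi', 'Earth'], $text);

// a pattern ("any run of whitespace") needs a regular expression
$collapsed = preg_replace('/\s+/', ' ', $text);

echo $simple, "\n", $collapsed, "\n";
```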

Constants & configuration

Defining global constants is a powerful way to bring some flexibility into large web applications. For this, PHP offers the define() function. Since PHP supports OOP, you can also define constants in classes. This approach is not only better structured but also faster than the define() alternative. It's more comfortable to build a configuration class that contains all constants; a class constant is then accessed with, for instance, DB::HOST.

//old way
define('DB_HOST', 'localhost');
define('DB_USER', 'mysql');
define('DB_PW', '**************');
define('DB_DB', 'application');

//other way
class DB{
    const HOST = 'localhost';
    const USER = 'mysql';
    const PW = '**************';
    const DB = 'application';
}

Statistical probability (LazyEnd)

Sometimes you have to make decisions about the program flow of your application. Here you can gain a significant performance increase if you let PHP evaluate the cheap functions before the expensive ones.

if(expensive_function() || small_function()){
    //magic
}

//is significantly slower than

if(small_function() || expensive_function()){
    //magic
}

But why? PHP evaluates the condition from left to right. When the function on the left returns true, the complete OR condition is true, so PHP doesn't execute the second function because it is not necessary. This works for logical AND, too: the complete condition is false as soon as one expression is false. So put the small function or the fastest expression on the left side.
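The short-circuit behaviour described above can be made visible with call counters. The functions keep the names from the example, but their bodies and the $calls array are assumptions made for this sketch.

```php
<?php
$calls = ['small' => 0, 'expensive' => 0];

function small_function(array &$calls): bool
{
    $calls['small']++;
    return true;
}

function expensive_function(array &$calls): bool
{
    $calls['expensive']++;
    usleep(1000); // stands in for real work
    return true;
}

if (small_function($calls) || expensive_function($calls)) {
    // reached without ever running expensive_function()
}

print_r($calls);
```

The saving only materializes when the cheap check is true often enough (for OR) or false often enough (for AND).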

Another statistical optimization is to think carefully about the structure of if/else and switch statements. You should keep the conditions that are most likely at the top of your if/else or switch construct to avoid work and save time.

If vs. switch vs. ternary operations

The general opinion about if vs. switch is that the if/else construct has a better implementation and therefore performs faster. I tried to shed some light on this by browsing the code and the internals of PHP. It's true that the if/else implementation looks smarter, so I can confirm the general opinion. But there is a construct that works faster than if/else: the ternary operator. Below is an example of the simple usage of a ternary operation (a more compact way of writing if/else):

return ($condition)? $if_value : $else_value;

Increment operations

To increment an integer variable (e.g. during a loop iteration) you have a few options. PHP provides the pre- and post-increment operators, and here we look at the performance of these operators. The fastest method is the pre-increment ++$i, because PHP increments first and then returns the new value, which is roughly a single operation. The second fastest is the post-increment $i++, where PHP additionally has to keep the old value in order to return it. The third fastest is the normal assignment (e.g. $i = $i + 1), because this is a general operation that can add, divide, etc. with a variable or a number. That makes it more complex and slower to execute.

Feb
26

Numbers don't lie, and here is the proof. Simon Newcomb found out that the distribution of digits in many real-world statistics, numbers and constants obeys a specific law. He published his finding in a paper called “Note on the Frequency of Use of the Different Digits in Natural Numbers”. Nevertheless, Frank Benford was the first who wrote an extensive article about this phenomenon. The article came out in 1938 and was named “The Law of Anomalous Numbers”. Today, analysts and statisticians use Benford's Law to detect manipulation in a set of numbers.

Validity

The law holds for most (but not all) sets of numbers in these areas:

  • Stock prices
  • Street addresses
  • Populations (of countries, cities)
  • Mathematical and physical (nature) constants
  • Economic statistics
  • Financial sector
  • Et cetera

Definition

Benford’s law is defined as:

[Figure: definition of Benford's law | src: wikipedia.org]

n … position of d in the number

d … [0-9] as the digit
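The formula itself was an image that did not survive in this archive. As an assumption, the following is the commonly cited general form of Benford's law for the digit d at position n, written with the symbols defined above; the last line is the equivalent product form.

```latex
\begin{align}
  % first digit (n = 1), d = 1, ..., 9
  P_1(d) &= \log_{10}\!\left(1 + \frac{1}{d}\right) \\
  % digit d = 0, ..., 9 at position n >= 2 (summation form)
  P_n(d) &= \sum_{k=10^{\,n-2}}^{10^{\,n-1}-1} \log_{10}\!\left(1 + \frac{1}{10k + d}\right) \\
  % the same probability with a single logarithm (product form)
  P_n(d) &= \log_{10} \prod_{k=10^{\,n-2}}^{10^{\,n-1}-1} \left(1 + \frac{1}{10k + d}\right)
\end{align}
```

The summation and product forms are equal because a sum of logarithms is the logarithm of the product.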

Implementation

I started with an implementation of Benford’s Law in C. The first algorithm equals the classical Benford, which uses the summation formula.

Benford’s Law using summation formula

While testing this implementation, I noticed that the computation takes too much time, so this algorithm didn't satisfy me. After some suggestions from Prof. Dr. K. Neumann and some mathematical tricks, I tested a new Benford algorithm which uses the product formula.

Benford’s Law using product formula

This version is much faster than the first implementation. The main reason is that the product-formula version applies the logarithm function just once, to the final product, whereas the summation version computes a logarithm in every step.
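The original C listings are not part of this archive. As a rough illustration only, the two variants could look like this in PHP; the function names are hypothetical, and the formulas are the commonly cited general form of Benford's law for positions n >= 2, which is an assumption about what the original code computed.

```php
<?php
// Hypothetical sketch, not the original C implementation.

// Summation form: one logarithm per loop step.
function benford_sum(int $n, int $d): float
{
    $p = 0.0;
    $from = 10 ** ($n - 2);
    $to = 10 ** ($n - 1) - 1;
    for ($k = $from; $k <= $to; $k++) {
        $p += log10(1 + 1 / (10 * $k + $d));
    }
    return $p;
}

// Product form: multiply first, take a single logarithm at the end.
function benford_product(int $n, int $d): float
{
    $product = 1.0;
    $from = 10 ** ($n - 2);
    $to = 10 ** ($n - 1) - 1;
    for ($k = $from; $k <= $to; $k++) {
        $product *= 1 + 1 / (10 * $k + $d);
    }
    return log10($product);
}

// probability that the second digit of a number is 0
printf("%.4f %.4f\n", benford_sum(2, 0), benford_product(2, 0));
```

Both functions return the same probability up to rounding; they differ only in how often the comparatively expensive logarithm is evaluated.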

Here you can get the source files:

More

For further information on Benford’s Law check mathpages and wolfram.

Dec
14

Hello everyone, today I want to introduce you to the Neuron Based Data Structure (NBDS for short). Recently, I have often dealt with medicine and its similarities to computer science. What can computer scientists learn from medicine and nature? The answer is: very much! Nearly everything that we have developed was originally inspired by nature. That is why I decided to build a data structure that looks like a neural network. From some points of view it isn't really a neural network, just a network with relationships between the vertices; nevertheless, I called the vertices neurons and the edges axons. But what is the idea behind this network structure? In this article, I will explain how to use it.

How it is structured (SNA)

Space

The Space is the master class. It includes all neurons and axons of the network. You can add, edit and remove axons and neurons with methods of the Space class. The Space is also the class that performs your queries on the network, like get, select, route etc. I intend to give the Space the ability to capture some statistical data about the network, for example to store routing information.

Neuron

The neuron class is used as a container for some data. It has a unique ID, a name, attributes, a value and a used-counter, which captures how many times this neuron was touched by queries and so on. This class represents the data in the network. We can say that this container is a kind of meta-variable (or object). With the attributes array you can also store data about the data, i.e. information about the neuron itself.

Axon

The axon class represents the relationships between the data in the network. You can use an axon to add a logical connection between two neurons. If you connect two neurons with an axon, it shows that the data of these two neurons are related to each other. To explain the relationship, the axon has an attribute called description. It also has a unique ID, a name, a value and an attributes array to specify the attributes of the relationship, and it stores the unique IDs of the two neurons that it connects in neuron_from and neuron_to. Another attribute specifies whether the axon works bi-directionally or one-directionally.

[Figure: structure of the NBDS]
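Since the NBDS code itself is not published here, the following is only a sketch of how the three classes could look in PHP. The class names and the properties follow the description above; the method names (addNeuron(), addAxon(), neighbours()) and the constructors are assumptions.

```php
<?php
// Hypothetical sketch of the three NBDS classes, not the published code.
class Neuron
{
    public $id;
    public $name;
    public $value;
    public $attributes = [];
    public $used = 0; // how often queries touched this neuron

    public function __construct(int $id, string $name, $value = null)
    {
        $this->id = $id;
        $this->name = $name;
        $this->value = $value;
    }
}

class Axon
{
    public $id;
    public $name;
    public $description = '';
    public $value;
    public $attributes = [];
    public $neuron_from;
    public $neuron_to;
    public $bidirectional;

    public function __construct(int $id, string $name, int $from, int $to, bool $bidirectional = true)
    {
        $this->id = $id;
        $this->name = $name;
        $this->neuron_from = $from;
        $this->neuron_to = $to;
        $this->bidirectional = $bidirectional;
    }
}

class Space
{
    private $neurons = [];
    private $axons = [];

    public function addNeuron(Neuron $neuron): void
    {
        $this->neurons[$neuron->id] = $neuron;
    }

    public function addAxon(Axon $axon): void
    {
        if (!isset($this->neurons[$axon->neuron_from], $this->neurons[$axon->neuron_to])) {
            throw new InvalidArgumentException('axon refers to an unknown neuron');
        }
        $this->axons[$axon->id] = $axon;
    }

    // IDs of all neurons that can be reached from $id over a single axon
    public function neighbours(int $id): array
    {
        $result = [];
        foreach ($this->axons as $axon) {
            if ($axon->neuron_from === $id) {
                $result[] = $axon->neuron_to;
            } elseif ($axon->bidirectional && $axon->neuron_to === $id) {
                $result[] = $axon->neuron_from;
            }
        }
        return $result;
    }
}
```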

Querying the network

With no ability to perform operations on this network, it would be senseless to use it. That is why I thought about some operations that you can execute on the network. But  which operations make sense?

Constants & Definitions

<Type> := NBDS_AXON | NBDS_NEURON | NBDS_ROUTE
<Attr> := NBDS_ID | NBDS_NAME | NBDS_ATTR | NBDS_USED | NBDS_DESC
<Direction> := NBDS_FROM | NBDS_TO
<Options> := NBDS_AXONS | NBDS_NEURONS | NBDS_ALL
<In> := NBDS_IS | NBDS_LIKE | NBDS_BETWEEN | NBDS_IN
<ValueN> := string | numeric
<PatternN> := string | numeric

Get(<Type>, <Attr>, <Value1>, <Value2 = null>)

The Get operation performs a simple get request on the network. The <Type> parameter should be a constant, for instance NBDS_AXON if you want to get back the axon that exists between neuron <Value1> and neuron <Value2>. As you can see above, I intend to work with many other constants to perform requests.

Select(<Type>, <Attr>, <In>, <Pattern1>, <Pattern2 = null>)

A Select command is a bit more complex. It works like the SQL select statement, except that we execute our queries on a network instead of on tables. Stored in an array, however, the neurons and axons have some similarities with tables. So I decided to implement a Select command that returns a result set (Get() returns just one object).

Route(<From>, <To>, <Options>)

To route through the network from point <From> to point <To> you can use the Route operation. In some cases it is useful to get the information that is linked between some neurons. This task is a bit tricky, because routing and search algorithms are expensive. After some experiments I decided to use a kind of BFS (breadth-first search) algorithm. But why did I choose this one? The experiments I did with algorithms like Floyd-Warshall, Dijkstra's algorithm, Bellman-Ford, breadth-first search and depth-first search were time-consuming, but of course informative. As you can see at this link (from Stony Brook University / Dept. of Computer Science), breadth-first search acts more like a spider. Unlike depth-first search (DFS), it first scans the neighborhood of the start neuron and afterwards goes deeper into each node. I think that BFS makes more sense, because the data that is related to a neuron must lie near this neuron. So we must scan all the neighbors of this neuron before we go deeper into the children of each neighbor.

Algorithms like Floyd-Warshall, Dijkstra and Bellman-Ford are too sophisticated to handle this task in an acceptable amount of time. The BFS algorithm used in this network works as simply (and smoothly) as possible.
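A breadth-first route search can be sketched in a few lines of PHP. This is not the Route() implementation of NBDS; the function name bfs_route() and the plain adjacency array used as input are assumptions made for this example.

```php
<?php
// Hypothetical sketch: breadth-first search on an adjacency list.
// $graph maps a neuron ID to the list of IDs of its direct neighbours.
function bfs_route(array $graph, int $from, int $to): ?array
{
    $previous = [$from => null]; // visited set and back pointers in one array
    $queue = new SplQueue();
    $queue->enqueue($from);

    while (!$queue->isEmpty()) {
        $current = $queue->dequeue();
        if ($current === $to) {
            // walk the back pointers to rebuild the route
            $route = [];
            for ($node = $to; $node !== null; $node = $previous[$node]) {
                array_unshift($route, $node);
            }
            return $route;
        }
        // scan all direct neighbours before going deeper
        foreach ($graph[$current] ?? [] as $next) {
            if (!array_key_exists($next, $previous)) {
                $previous[$next] = $current;
                $queue->enqueue($next);
            }
        }
    }
    return null; // no route exists
}

$graph = [1 => [2, 3], 2 => [4], 3 => [4], 4 => [5], 5 => []];
print_r(bfs_route($graph, 1, 5));
```

Because all axons count the same here, the first route that BFS finds is also one with the fewest hops.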

You can control the behavior of Route() with some <Options> that I will describe later on.

Examples

Now I will explain some use cases of the neuron-based data structure. For web applications this structure isn't so suitable, because web servers have to respond fast and securely. In most cases web-based systems keep their logic in the database; this concept works the other way around. Still, in some cases you need a relationship between two data types. For instance, you can implement a registry with the singleton pattern to access objects in all layers of your application, and I think it is useful to build this network structure into such a registry. Below I list some use cases for this structure (NBDS) in web-based systems:

  • Intelligence in Registry-structure (singleton-based implementation)
  • Reflect relationships between users with attributes
  • Generate FOF (New Galaxy Group Finding Algorithm) dependencies
  • Reflect relationships to perform requests and get the usage counter to see which neuron is the most important node
  • Put the result of database queries into the network to keep up the relationships between tables or rows
  • Alternative paradigm for generating a page (all neurons that were hit during a Route() operation must be displayed/sent)

The second part of the example section deals with compiled programs and desktop applications. For this kind of application the question of whether we can serve at high speed is less critical, because these applications are compiled. That's a great benefit for us, because we can use this structure for nearly all data that we need in our program. We can build relations between 2 objects, for instance to display these relations and decorate them with attributes that we've defined before.

  • Generate a network of cities with attributes and streets between them (also with attributes), and simulate traffic with the Route() operation
  • FOF algorithms
  • Describe connections between data types, for instance just with a pointer

Conclusion

To sum up, the idea of connecting two variables (or even objects) isn’t new, but I didn’t find any implementation of it. Thinking this idea through was a very helpful way to explore how a data structure can be built. I have implemented it in PHP, because I’m very familiar with this language and it keeps things simple. When I have finished the code and the documentation, I will post it on the blog. See the project page of NBDS for more information.

Dec
01

WordPress. A word that stands for an established system and a nice backend for writing and editing articles, but also for a frontend that performs slowly and heavily. At cip-labs I developed a cache for this system. The cache is very easy to install and does its caching in a very simple way.

How it works

It works quite simply. I modified the index.php of the WordPress installation. When a request comes in, an object of CIP_WP_Cache is created. This object represents the cache. It looks in the cache directory and tries to find the requested file. The files are stored in the directory defined in CIP_CACHE_DIR and are named like md5($_SERVER['REQUEST_URI']). If the file is found in the cache, the cache sends it as the response and then shuts down. The benefit of doing so is that the WordPress engine is never reached during the whole request. If the file is not found in the cache, CIP_WP_Cache captures the output buffer of the WordPress engine and stores the result in its cache. For the next requests to this URI, the cache responds with the file from the cache.
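
The flow described above can be sketched roughly as follows. This is not the actual CIP_WP_Cache code: the function names and the $render callable (which stands in for the WordPress engine) are assumptions made for illustration.

```php
<?php
// Sketch only: function names and the $render callable are hypothetical,
// not the actual CIP_WP_Cache code.

/** Cache files are named after the md5 hash of the request URI. */
function cachePath(string $cacheDir, string $requestUri): string
{
    return rtrim($cacheDir, '/') . '/' . md5($requestUri);
}

/**
 * Returns the page for $requestUri. On a cache hit the $render callable
 * (standing in for the WordPress engine) is never invoked.
 */
function serveThroughCache(string $cacheDir, string $requestUri, callable $render): string
{
    $file = cachePath($cacheDir, $requestUri);
    if (is_file($file)) {
        return file_get_contents($file);   // cache hit: read the static file
    }

    ob_start();                            // cache miss: capture the engine's output
    $render();
    $html = ob_get_clean();

    file_put_contents($file, $html, LOCK_EX);   // store it for the next request
    return $html;
}
```

In a modified index.php, the result would be echoed and the script would exit right after a hit, so that nothing else of WordPress is loaded.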

That’s it; it does nothing more and nothing less. Of course, you need more functionality to control the cache, and I will write a plugin for WordPress that allows you to control the behavior of the cache. Below are a few benchmark results from my recent tests:

Benchmark

The results of the benchmark are clear. The heavy WordPress engine (avg. 13 MB RAM usage) battles against a static file. I used ab -n 1000 -c http://www.cip-labs.net/ to test the performance of the cache. I think the specification of the benchmark environment is irrelevant for this benchmark.

without cip-cache

  • time taken for requests: 311.653 sec
  • requests per second: 3.21

with cip-cache

  • time taken for requests: 1.897 sec
  • requests per second: 526.88

To sum up this benchmark: cip-cache serves the same 1000 requests roughly 160 times faster. The cache works with a TTL directive (called CIP_CACHE_TTL), which you can configure in the cip-wp.php file. When cip-cache creates a cache file, the request runs through the WordPress engine once, and the captured output buffer is then written to the cache directory.
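
A TTL check of this kind could look like the sketch below. It assumes that the age of a cache file is measured from its modification time, which the text does not state; isCacheFresh() is a hypothetical name.

```php
<?php
// Sketch: assumes the TTL is compared against the file's modification time.
// isCacheFresh() is a hypothetical helper, not part of cip-wp.php.

/**
 * True if $file exists and is younger than $ttlSeconds.
 * $now can be passed in for testing; by default the current time is used.
 */
function isCacheFresh(string $file, int $ttlSeconds, ?int $now = null): bool
{
    if (!is_file($file)) {
        return false;
    }
    $now = $now ?? time();
    return ($now - filemtime($file)) < $ttlSeconds;
}
```

A stale file would simply be treated like a cache miss, so the page is rendered again and the file is overwritten.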

How to Install

First, overwrite the index.php of your WordPress installation with the index.php from the source package (copy both files, cip-wp.php and index.php, into the directory that contains the original index.php). Second, configure the cache with the CIP_CACHE_DIR and CIP_CACHE_TTL directives. Create a directory for the cached files and make sure that you have set all required permissions (777).

  • CIP_CACHE_DIR: the directory that holds the generated cache files
  • CIP_CACHE_TTL: time to live (in seconds)
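
As an illustration, the two directives might be set like this, assuming they are ordinary PHP constants created with define(). The path and the TTL value are placeholders, and whether the directory needs a trailing slash depends on the actual cip-wp.php.

```php
<?php
// Placeholder values (assumptions); adjust both to your installation.
define('CIP_CACHE_DIR', __DIR__ . '/cip-cache/');   // directory for the generated cache files
define('CIP_CACHE_TTL', 3600);                      // time to live in seconds (here: one hour)
```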

That’s it.

Wishlist

  • control the cache behavior with a small Wordpress plugin (clear cache [item or all], set TTL, set cache dir, add/remove routes to control paths to cache)
  • cache only for readers and not for admins or users, to avoid HTML fragments that should not be displayed in the reader’s view

goto: project page

goto: download version 1.0.0

Sep
18

Nearly every web application needs a professional framework that provides functionality, scalability and useful utilities for creating and extending good code. At cip-labs I use an in-house framework for creating web projects. While working on several projects, I extend this lightweight framework with useful features. At the moment it is still in the development stage, and some features are not ready to be posted on the blog, but over time I will release parts of the framework to the public.

The first part on the blog, and the latest part of the framework itself, is the floraPHP string library. The library contains useful functions (procedural style) for string manipulation in PHP. PHP delivers some built-in functions, but for intensive string handling they are not enough.

With this library you can, for example, analyze the data types of the words of a given string, calculate the average sentence and word length, check whether $needle is a word in $haystack, use a more flexible substring function, and so on.
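
To give an idea of what such helpers can look like, here is a small sketch of two of the functions mentioned above. The function names are hypothetical and are not taken from the floraPHP string library; the real signatures may differ.

```php
<?php
// Hypothetical helpers; the names are not taken from the floraPHP library.

/** True if $needle occurs in $haystack as a whole word (case-insensitive). */
function isWordIn(string $needle, string $haystack): bool
{
    // \b matches a word boundary, so "cat" does not match inside "concatenate".
    $pattern = '/\b' . preg_quote($needle, '/') . '\b/i';
    return preg_match($pattern, $haystack) === 1;
}

/**
 * Average length of the whitespace-separated words in $text.
 * Lengths are counted in bytes, and attached punctuation counts as part of a word.
 */
function averageWordLength(string $text): float
{
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) === 0) {
        return 0.0;
    }
    $total = array_sum(array_map('strlen', $words));
    return $total / count($words);
}
```

A plain strpos() check would also report a hit for a needle inside a longer word, which is why the word check uses a word-boundary pattern here.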

goto: floraPHP string library project page

goto: floraPHP project page

Jul
30

Today, we want to introduce you to trim-me.com. Trim-me is a web service that shortens your long links to the minimum. At the moment TM is still in beta, but we are working on a final version that contains some statistics functions and support for Internet Explorer.

You can use trim-me to shorten your links for Twitter, Facebook, ICQ or other services.

Visit: trim-me.com