XiVO – open-minded telecom systems

Libre software, open hardware: the blog of the XiVO projects

XiVO's new directory server

Some of you may know that a new version of the XiVO client is being worked on at the moment. The new look and feel requires some work behind the scenes to merge all the kinds of contacts a XiVO user might have into a single source of information.

What does all this mean ? A personal contact, an entry from an LDAP server or another XiVO user should all be available to the user as one single list of contacts, even if some operations are not available on all kinds of entries.

We already had a directory xlet that merged XiVO users and remote directory entries into a single list. This xlet is used by the switchboard profile. The directory xlet does most of the heavy lifting of merging the different contact lists on the client side. This solution is not ideal for the future of XiVO as we would like to be able to develop a mobile or web-based version of the client without the burden of rewriting the same logic.

The first step of our work on this new client is to move this directory logic to the server side. To do so, we are creating a new directory service, named xivo-dird, that will be responsible for handling all queries made to all configured directory sources on a XiVO.

This new service will offer a public REST interface. This means that custom client-side applications will be able to integrate the services provided by xivo-dird easily. We are also making this new service runnable without a complete XiVO ecosystem. It will be possible to install xivo-dird on a dedicated server or in a container. The nature of the work done on xivo-dird will also make it easy to run the service in a distributed manner. With some configuration, an administrator will be able to have many xivo-dird servers running behind a load balancer so that it may be used by many XiVOs simultaneously. For example, Avencall has one XiVO per office but could use the same xivo-dird proxy for all offices.

archi-xivo-dird.svg

Architecture

We are also designing xivo-dird for extensibility and we are trying to make plugins as easy as possible to create, making it easier for the community to contribute.

Plugins

Currently planned extension points include:

Backends

Backends are plugins that are used to query directory sources. This is where we find the logic for retrieving data from a specific kind of technology. Backends include, but are not limited to, ldap, csv, xivo-directory (the internal directory of a XiVO) and xivo-personal-directory (a user's personal contacts).
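To make the idea of a backend more concrete, here is a minimal Python sketch of what a csv backend could look like. The CsvBackend class, its search method and the column names are hypothetical and are not taken from the xivo-dird source; the only point is that a backend hides the source technology behind one lookup call.

```python
import csv
import io


class CsvBackend:
    """Hypothetical backend that searches contacts stored as CSV text."""

    def __init__(self, name, csv_text):
        self.name = name
        # Parse once; each row becomes a dict keyed by the header line.
        self._rows = list(csv.DictReader(io.StringIO(csv_text)))

    def search(self, term):
        """Return the rows in which any text field contains `term`."""
        needle = term.lower()
        return [
            dict(row, source=self.name)  # tag each result with its origin
            for row in self._rows
            if any(
                isinstance(value, str) and needle in value.lower()
                for value in row.values()
            )
        ]


# Made-up sample data, for illustration only.
SAMPLE = "firstname,lastname,number\nAlice,Martin,1001\nBob,Durand,1002\n"

backend = CsvBackend("my_csv", SAMPLE)
results = backend.search("ali")
print(results)
```

An ldap or xivo-directory backend would expose the same search call and differ only in how it fetches the entries.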

HTTP views

HTTP views are different URLs that are exposed by the xivo-dird server. At the moment we know that we will have a json view that will be used by other XiVO services to retrieve lookup results. Other views will be added to support other needs. Phones are a good example of consumers that require a customized view. Adding support for a new brand of phone to xivo-dird will be a matter of adding the HTTP view plugin that formats the lookup results in a way that the phone understands.

Core

The core of the application is responsible for loading all of the plugins. We will probably use a third party library for this job. We have a proof of concept using stevedore at the moment. Concurrency is also managed by the core.
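The role of the core can be sketched as follows. This is only an illustration under stated assumptions: the Core, StaticBackend and json_view names are hypothetical, the plugins are passed in by hand instead of being discovered by a plugin library such as stevedore, and a thread pool from the Python standard library stands in for whatever concurrency model the real service ends up using.

```python
import json
from concurrent.futures import ThreadPoolExecutor


class StaticBackend:
    """Hypothetical backend returning entries from an in-memory list."""

    def __init__(self, name, entries):
        self.name = name
        self._entries = entries

    def search(self, term):
        needle = term.lower()
        return [
            dict(entry, source=self.name)
            for entry in self._entries
            if needle in entry["name"].lower()
        ]


def json_view(entries):
    """Hypothetical view: format merged lookup results as JSON."""
    return json.dumps({"results": entries}, sort_keys=True)


class Core:
    """Hypothetical core: holds the plugins and runs lookups concurrently."""

    def __init__(self, backends, views):
        self._backends = backends  # list of backend plugins
        self._views = views        # view name -> formatting function

    def lookup(self, term, view="json"):
        # Query every backend in its own thread; map() keeps backend order.
        workers = max(1, len(self._backends))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            per_backend = list(pool.map(lambda b: b.search(term), self._backends))
        merged = [entry for found in per_backend for entry in found]
        return self._views[view](merged)


core = Core(
    backends=[
        StaticBackend("xivo", [{"name": "Alice Martin", "number": "1001"}]),
        StaticBackend("ldap", [{"name": "Alicia Keys", "number": "5551234"},
                               {"name": "Bob Durand", "number": "1002"}]),
    ],
    views={"json": json_view},
)
print(core.lookup("ali"))
```

Supporting a new phone brand would then amount to registering one more entry in the views mapping, without touching the backends.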

This kind of architecture will become the reference for other XiVO services. Having modular services that can be executed independently from each other will allow us to scale the required parts of XiVO when needed.

You can look at the GitHub repository to view the source code and follow our work. Note that the master branch does not include this work yet. The code in the other branches consists of proofs of concept used to confirm that our architecture could handle the kind of load we were aiming for and that our modular architecture could be achieved. This code is not meant for production and will be replaced once we write the production version.

Sprint review 14.18

Here are a few links explaining what we are going to ship in XiVO 14.18:

Update from XiVO 13.20

It's been a while since we gave any updates, but we've been quite active.

We are currently writing a REST API to configure XiVO, giving access to a simplified set of the Web interface's controls and replacing the current Web services. A few examples of what can be done with this API include listing users, lines, extensions, devices, voicemails, creating users, giving them a SIP line and an extension. The documentation of the API is available online.

The REST API offers one new feature that is not possible via the Web interface: associating multiple users to a single SIP line. The main use case is for multiple users sharing the same physical phone. They can be called using the same or different extensions.

The next step to improve the REST API would be to associate a device to a user, which can currently only be done via the Web interface. Unfortunately, this requires cleaning up and rewriting in Python a fairly large amount of code from the PHP Web interface, mainly because of the handling of programmable function keys, so we are taking the time to do it right.

This development has three consequences:

- First, we are cleaning the storage systems for users, lines, devices, etc., which means changing the database schema and removing useless data caches.

- Second, we are developing a Python interface to configure XiVO, which our REST API uses, and which eventually third-party Python scripts will be able to use, once it is documented.

- Third, we're pushing all configuration events into a software bus (RabbitMQ), so that XiVO components are aware of configuration changes, and eventually third-party programs may be aware of them as well. Again, this will be available once it is documented.

We are also working towards upgrading XiVO to the next version of Debian, named Wheezy. The next step is to backport PostgreSQL from Wheezy, so that the database migration, which is not so simple, is not done at the same time as the whole system upgrade.

Finally, we moved all our Git repositories to Github. Some time ago, we moved some of our repositories to Gitorious, which we preferred because it is completely based on free software, but we've had a few problems with it. So we decided to switch to Github. You can now fork us at https://github.com/xivo-pbx.

Understanding the XIOH label on our product

Dear XIOH followers,

As we have passed the CE marking certification, we have finalized the label that will appear on our product, with the relevant information needed to identify the product once it is on the market :

XIOH_Label

  • "XIOH - version 5" : the name of the product and the hardware version
  • "S/N : XIOH-5-1236-29" : the serial number, including the hardware version (5), the production batch (1236) and the number in this batch (29)
  • "AC input: 100-240V~, 4-2A, 60-50Hz" : the power requirements of our ATX 180W power supply
  • "Manufacturer :Avencall" : the name of the manufacturer (which could be different for a different production project)
  • "Sources : http://0001-0001.okey.ohanda.org" : the OHANDA trademark is our legal umbrella for this OpenHardware project/product, and the label gives a direct link to the Git repository of our hardware and software files. This will help customers and users get information on the hardware, contact us and share information
  • "Made in France" : we are currently producing in France (i.e. locally, as we are based here), but we would rather talk about local production, to track the carbon footprint of our product from component sourcing to cabling and packaging
  • Various logos, including the "CE" (1) and RoHS process of production logos as well as the OHANDA logo.

(1) : http://en.wikipedia.org/wiki/CE_marking
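For tooling that needs to read the serial number back, the layout described above can be decoded with a few lines of Python. This is a sketch that assumes every serial number follows the NAME-VERSION-BATCH-UNIT layout of the single example on the label; the parse_serial function and the field names are hypothetical.

```python
import re
from typing import NamedTuple


class SerialNumber(NamedTuple):
    product: str
    hardware_version: int
    batch: int
    unit: int


# Layout assumed from the one example on the label: NAME-VERSION-BATCH-UNIT.
_SERIAL_RE = re.compile(r"^([A-Z]+)-(\d+)-(\d+)-(\d+)$")


def parse_serial(text):
    """Split a serial number such as 'XIOH-5-1236-29' into its fields."""
    match = _SERIAL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized serial number: {text!r}")
    product, version, batch, unit = match.groups()
    return SerialNumber(product, int(version), int(batch), int(unit))


serial = parse_serial("XIOH-5-1236-29")
print(serial)
```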

Passing the CE marking certification for EU market

Dear XIOH followers and beta-testers,

We are delighted to announce that we have successfully passed the CE marking certification, which means that we can sell to anyone in the European Union.

CE_Marking_ImmunityTests

The CE marking consists of 2 sets of tests :

  • Emission tests according to EN55022 (standard for radiation emission) :
    • EN300386 (V.1.6.1) : Emission tests for Electromagnetic compatibility
    • Measurement of radiated electric field in shielded room (EN55022:2006) for the C Class
    • Measurement of conducted disturbance on the AC main power port
    • Measurement of conducted disturbance on telecom port / control port
    • EN61000-3-2 : Limits of harmonic current emission 
    • EN61000-3-3 : Limits of voltage fluctuation and flicker in low voltage supply, for apparatus with current <= 16A
  • Safety tests according to IEC60950-1:2005 and EN60950-1:2006

Starting beta-test period for XIOH power-users and developers

Dear XIOH followers and developers,

We are now entering our beta-test period with our first batch of XIOH appliances available for shipping for developers and power-users that could be interested in testing and giving feedback on our product. The appliances are currently shipped in a metallic 1U-rackable case for up to 50 users and 10 simultaneous calls. It ships with XiVO 12.24. We're using free MSP430 firmware running the power sequence of voltage levels for the different functional blocks, and a free CPU boot firmware based on Coreboot. Each ROM contains MAC addresses with our OUI prefix, and a serial number. Public design source files are also available.

Here is a glimpse of the look and feel of the appliance:

XIOHv5 prototype casing

If anyone is interested, please contact us at xcarcelle_at_avencall_dot_com for availability and potential shipping.

The feedback from our first appliances running in production in 5 different locations is good, and we are able to confirm that we can handle up to 1000 calls a day on a single XiVO IOH.

HaPPy 2013 OpenHardware to yall.

A Switchboard for XiVO

We have been working on a new switchboard profile for the XiVO client and this post is an overview of what you might expect in the next few releases.

Hacking at the Hackfest

Hello once again XiVO followers ! What's this ? My first blog post has barely been published and I've already written a second one ! But this time, I'll be writing about something that will be more interesting for 1337 h@X0r$ and the like: the Hackfest !

Hackfest ? What's a Hackfest ?

The Hackfest is one of the biggest events about computer security in the province of Quebec. This year, more than 400 participants were treated to 2 days full of conferences and hacking games like Lockpicking, Cyber warfare and Capture The Flag.

Hackfest cyberwar

XiVO also participated in the Hackfest by giving a conference and organising the XiVO pwn2own hacking game.

What kind of conference ?

We gave a conference about the security and future of free telecommunications; the slides are attached to this post. The conference was supposed to be given by our colleague Nicolas Bouliane, but unfortunately, he got sick a few days before the event. My Scrum master and I gave the talk instead. Here's a picture of us during the conference. As you can see, the room was pretty full !

Hackfest 2012: avencall #hf2012

What about the pwn2own ?

Hackers were given 48 hours to try and hack a standard XiVO server and find the most exploits possible. As the game went on, clues were given out through our twitter feed.

XiVO twitter feed

Once a hacker found an exploit, he could submit it to our scoreboard to win points. At the end of the game, the top 3 teams with the most points won cash prizes. Here's a screenshot of the scoreboard at the end of the game.

XiVO Scoreboard

So what happened ? Did you get hacked ?!

Yes ! As you can see from the scoreboard, 3 teams were able to find quite a few exploits in XiVO. What surprised us most was the number of hacks that were found on the web interface. Since XiVO is first and foremost a telephony system, we thought that the hackers would concentrate on hacking the telephone services (for example, controlling SIP accounts, creating fake telephones through the provisioning service, or DDoSing the Asterisk server). After all, the web interface is only used for administrative purposes and isn't a critical piece of the XiVO server. Instead, we got a total of 10 web exploits and only 1 telephone exploit.

What happens now ?

The XiVO dev team is working on fixing all the exploits found during the pwn2own. The fixes will be released in version 12.22 at the end of next week.

All in all, we had great fun participating in the Hackfest. We're already thinking about how we can make the game more exciting next year, like how to encourage people to go explore more of XiVO's telephony services.

Thanks again to the 3 teams who participated in XiVO's pwn2own (RingZer0, Abed&Francis and Bitducks). We look forward to more hacking next year !

A conference with Uncle Bob

Hello to everyone following the XiVO blog ! My name is Gregory Eric Sanderson Turcot Temlett MacDonnell Forbes. I've been working on the XiVO software team for 2 months now and the time has come to publish my first official blog post. Enjoy ;)

At the end of last September, the XiVO team had the amazing chance of attending a talk given by none other than Robert C. Martin himself ! Mr. Martin, also known as "Uncle Bob", is a highly experienced software developer with over 30 years of experience. The subject of his talk was seemingly simple: What should we expect from a professional software developer ? More than you would think. Uncle Bob divided his talk into about a dozen different expectations. Here is my personal interpretation of each expectation he gave.

Astricon 2012

astricon.JPG

For the second year, Avencall was present at the biggest Asterisk conference in the world: Astricon. This year, the conference was held in Atlanta, Georgia.

Like every year, the Monday is the AstriDevCon, a small conference for developers by the developers to discuss the future and how to improve Asterisk. It's a very nice place to get to know the other people that are coding inside Asterisk and also to coordinate with others about what has to be done.

You can find a nice summary of the whole day of discussion here: https://wiki.asterisk.org/wiki/display/AST/AstriDevCon+2012

As you can see, this year we also had a booth (#28). It was a great experience to exhibit at Astricon because there is a huge flow of interested and interesting people. It was a lot of fun to talk about the XiVO project and the XiVO Open Hardware. All the gizmos we had on our table really attracted many people, especially the Raspberry Pi that was running XiVO.

conf-xav.jpg

To communicate further about us, we also gave a conference to present Avencall and the XiVO telephony system.

Hello XiVO, Add Your Web Page to XiVO Web Interface

XiVO can be managed using a web interface. This interface is developed in PHP using a XiVO-specific framework.

The idea of this post is to begin to demystify XiVO web interface development.

Setting The Development Environment

First of all, we are going to set up the development environment. The best approach is to enable NFS on your development workstation and to mount the web interface development directory onto your XiVO virtual machine. If your development project is located in /projects/xivo-skaro, the mount command can be :

 mount -t nfs devipaddress:/projects/xivo-skaro/web-interface/src /usr/share/pf-xivo-web-interface

Now, each time you modify a file within the web-interface directory, you can check your update by refreshing your browser.

XiVO Hello World Page

So let's start by writing a simple web page to display "Hello XiVO world". Doesn't that remind you of something ?

XiVO Actions, Applications and Objects.

webi_src.png

The XiVO web interface is mainly composed of three types of elements :

  • actions: a kind of controller that routes the action to be done (list, create, edit) to the proper application
  • applications: to be considered as the business layer, containing the algorithm to be applied
  • objects: mainly where the persistence takes place.

There are many other components, but let's start with this simple view.

So if we need to display something, we have to follow these necessary steps :

  • Write a simple PHP page as an action
  • Add this action in the authorized control list
  • Add it to the proper menu
  • Link the menu generated URL to the proper URL
  • Add necessary translations to XiVO translation files

Writing The Page

Let's start with a simple page that we want to be displayed in the menu "services -> IPBX -> Call management". Edit the file xivo-skaro/web-interface/src/action/www/service/ipbx/asterisk/call_management/helloxivo.php :


<?php
	echo "Hello XiVO world !";
?>

XiVO Authorization Framework

You still cannot browse this new page because it is not declared within the XiVO authorization framework. The list of available pages is located in the file xivo-skaro/web-interface/src/object/objectconf/acl/user.inc. Edit this file and add this new action.

....
$array['tree']['service']['ipbx']['call_management']['pickup'] = true;
$array['tree']['service']['ipbx']['call_management']['schedule'] = true;
$array['tree']['service']['ipbx']['call_management']['cel'] = true;
$array['tree']['service']['ipbx']['call_management']['helloxivo'] = true;
.....

Now you can browse the page https://<your xivo host>/service/ipbx/index.php/call_management/helloxivo.

Create a XiVO user (menu Configuration -> Management -> Users) and check the authorizations for this user (click on the small key icon next to the user) and you will see that a new item appears under call management.

webi_acl.png

Translation

You can also note that an error message 'missing translation' is displayed. This is a special marker to make sure that nobody forgets to write the translation strings. These translations are located in the directory src/i18n, where a directory per language can be found.

Edit the file en_US/conf/acl.i18n and add the translation for service-ipbx-call_management-helloxivo. Do not forget to add the French translation, as in XiVO we always complete both the French and English translations.

Now we can display the new page and authorize a user to use it. What remains is to make the page reachable from the XiVO menu.

XiVO Menu Entries

Edit xivo-skaro/web-interface/src/tpl/www/bloc/menu/left/service/ipbx/asterisk.php, add the menu entry :

........
		if(xivo_user::chk_acl('call_management','cel') === true):
			echo	'<dd id="mn-call_management--cel">',
				$url->href_html($this->bbf('mn_left_callmanagement-cel'),
						'service/ipbx/call_management/cel'),
				'</dd>';
		endif;
		if(xivo_user::chk_acl('call_management','helloxivo') === true):
			echo	'<dd id="mn-call_management--helloxivo">',
				$url->href_html($this->bbf('mn_left_callmanagement-helloxivo'),
						'service/ipbx/call_management/helloxivo'),
				'</dd>';
		endif;
		echo	'</dl>';
	endif;

........

Fix the translation by adding the translation for mn_left_callmanagement-helloxivo in xivo-skaro/web-interface/src/i18n/en_US/tpl/www/bloc/menu/left/service/ipbx/asterisk.i18n.

The menu is now correctly displayed, but you still cannot click on it to display your new page. We must now register the menu URL within the XiVO framework.

XiVO URL Routing

Edit xivo-skaro/web-interface/src/object/objectconf/url.inc and add the URL translation :

.......
$array['service/ipbx/call_management/schedule'] = 'service/ipbx/index.php/call_management/schedule/';
$array['service/ipbx/call_management/voicemenu'] = 'service/ipbx/index.php/call_management/voicemenu/';
$array['service/ipbx/call_management/cel'] = 'service/ipbx/index.php/call_management/cel/';
$array['service/ipbx/call_management/helloxivo'] = 'service/ipbx/index.php/call_management/helloxivo/';

.......

Now you may click on the new menu entry and display the Hello XiVO world page.

webi_hello_menu.png

As you may notice, this page is not displayed using the XiVO look and feel, but that's another story.

Visualizing asterisk deadlocks

It has recently come to our attention that a freeze would sometimes occur in the asterisk application shipped with XiVO.

When the freeze happened, no new calls would be accepted and most of the current calls would freeze. A manual restart of the asterisk process would then be required for the situation to get back to normal.

As you can understand, that's quite an unpleasant situation for a telephony system like XiVO.

So we began investigating what was causing the freeze, knowing it was probably some deadlocks occurring in the asterisk process. Fortunately for us, asterisk provides some compile time flags that help with debugging such conditions. This is documented on the asterisk wiki.

After recompiling the XiVO version of asterisk with the DEBUG_THREADS and DONT_OPTIMIZE flags, and with the help of some other people, we were able to reproduce the freeze and get some information about the various locks held by the various threads of the frozen asterisk process via the "core show locks" command.

The output of the "core show locks" command looks like this:

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
=== Thread ID: 0xb71ffb70 (tps_processing_function started at [  457] taskprocessor.c ast_taskprocessor_get())
=== ---> Lock #0 (event.c): RDLOCK 1488 handle_event &(&ast_event_subs[event_types[i]])->lock 0x822aa78 (1)
	/usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x812a024]
	/usr/sbin/asterisk(__ast_rwlock_rdlock+0xae) [0x8125263]
	/usr/sbin/asterisk() [0x80ef7a5]
	/usr/sbin/asterisk() [0x8194305]
	/usr/sbin/asterisk() [0x81a5939]
	/lib/i686/cmov/libpthread.so.0(+0x5955) [0xb7324955]
	/lib/i686/cmov/libc.so.6(clone+0x5e) [0xb75361de]
=== ---> Lock #1 (chan_agent.c): MUTEX 421 device_state_cb &(&agents)->lock 0xb64fab48 (1)
	/usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x812a024]
	/usr/sbin/asterisk(__ast_pthread_mutex_trylock+0xae) [0x8123886]
	/usr/lib/asterisk/modules/chan_agent.so(+0x2f8b) [0xb64e7f8b]
	/usr/sbin/asterisk() [0x80ef81d]
	/usr/sbin/asterisk() [0x8194305]
	/usr/sbin/asterisk() [0x81a5939]
	/lib/i686/cmov/libpthread.so.0(+0x5955) [0xb7324955]
	/lib/i686/cmov/libc.so.6(clone+0x5e) [0xb75361de]
=== ---> Waiting for Lock #2 (chan_agent.c): MUTEX 430 device_state_cb &p->lock 0x96a1a60 (1)
	/usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x812a024]
	/usr/sbin/asterisk(__ast_pthread_mutex_lock+0xae) [0x812351e]
	/usr/lib/asterisk/modules/chan_agent.so(+0x2fea) [0xb64e7fea]
	/usr/sbin/asterisk() [0x80ef81d]
	/usr/sbin/asterisk() [0x8194305]
	/usr/sbin/asterisk() [0x81a5939]
	/lib/i686/cmov/libpthread.so.0(+0x5955) [0xb7324955]
	/lib/i686/cmov/libc.so.6(clone+0x5e) [0xb75361de]
=== --- ---> Locked Here: chan_agent.c line 516 (agent_lock_owner)
=== -------------------------------------------------------------------
===
=== Thread ID: 0xb6cffb70 (do_devstate_changes  started at [  726] devicestate.c ast_device_state_engine_init())
=== ---> Lock #0 (astobj2.c): MUTEX 661 internal_ao2_callback c 0x9551498 (1)
	/usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x812a024]
	/usr/sbin/asterisk(__ast_pthread_mutex_lock+0xae) [0x812351e]
	/usr/sbin/asterisk(__ao2_lock+0x48) [0x8087fe4]
	/usr/sbin/asterisk() [0x8088b19]
	/usr/sbin/asterisk(__ao2_callback+0x56) [0x8088fa4]
	/usr/sbin/asterisk(__ao2_find+0x29) [0x80890c0]
	/usr/sbin/asterisk() [0x80afe6c]
	/usr/sbin/asterisk(ast_channel_get_by_name_prefix+0x28) [0x80aff2c]
	/usr/sbin/asterisk(ast_parse_device_state+0x43) [0x80dfc3e]
	/usr/sbin/asterisk() [0x80dff2d]
	/usr/sbin/asterisk() [0x80e03ac]
	/usr/sbin/asterisk() [0x80e0702]
	/usr/sbin/asterisk() [0x81a5939]
	/lib/i686/cmov/libpthread.so.0(+0x5955) [0xb7324955]
	/lib/i686/cmov/libc.so.6(clone+0x5e) [0xb75361de]
=== ---> Waiting for Lock #1 (channel.c): MUTEX 1703 ast_channel_cmp_cb chan 0xae044858 (1)
	/usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x812a024]
	/usr/sbin/asterisk(__ast_pthread_mutex_lock+0xae) [0x812351e]
	/usr/sbin/asterisk(__ao2_lock+0x48) [0x8087fe4]
	/usr/sbin/asterisk() [0x80afa11]
	/usr/sbin/asterisk() [0x8088bdf]
	/usr/sbin/asterisk(__ao2_callback+0x56) [0x8088fa4]
	/usr/sbin/asterisk(__ao2_find+0x29) [0x80890c0]
	/usr/sbin/asterisk() [0x80afe6c]
	/usr/sbin/asterisk(ast_channel_get_by_name_prefix+0x28) [0x80aff2c]
	/usr/sbin/asterisk(ast_parse_device_state+0x43) [0x80dfc3e]
	/usr/sbin/asterisk() [0x80dff2d]
	/usr/sbin/asterisk() [0x80e03ac]
	/usr/sbin/asterisk() [0x80e0702]
	/usr/sbin/asterisk() [0x81a5939]
	/lib/i686/cmov/libpthread.so.0(+0x5955) [0xb7324955]
	/lib/i686/cmov/libc.so.6(clone+0x5e) [0xb75361de]
=== --- ---> Locked Here: channel.c line 3767 (__ast_read)
=== --- ---> Locked Here: chan_agent.c line 515 (agent_lock_owner)
=== -------------------------------------------------------------------
...

and it continues this way for a total of 363 lines.

The big advantage of this representation is that you have a lot of info to help with precise diagnostics and debugging. But there is a downside to this: it is quite hard to quickly get the "big picture" of which thread is waiting for which other thread, etc.

Since it seemed like there were no tools to help us with getting the big picture, we wrote a really simple python script that takes the output of the "core show locks" command and outputs a directed graph in DOT language of the relations between the threads. The generated graph is then fed to graphviz to generate an image like this:

Asterisk Deadlock

Each node represents a thread, labeled with its thread ID, and edges represent a "is waiting for" relation.

From the above image, we clearly see that there's a deadlock between thread 0xaeb2db70 and thread 0xb6cffb70 since both are waiting for each other. We also see a bunch of other threads waiting directly or indirectly on the deadlocked threads, showing the generalized freeze of the asterisk process.

The script, which is named graph_ast_locks.py, can be found in the xivo-tools repository. Given you have the output of a "core show locks" invocation, and that you have graphviz installed, you can then run the following command to generate a graph in svg format:

graph_ast_locks.py core-show-locks.txt | circo -Tsvg -o graph-locks.svg
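The idea behind such a script can be sketched in a few lines of Python. This is not the code of graph_ast_locks.py: it is a minimal illustration that assumes a thread waits for whichever threads hold a lock with the same address, and the wait_edges and to_dot function names as well as the sample input are made up for the example.

```python
import re
from collections import defaultdict

THREAD_RE = re.compile(r"^=== Thread ID: (0x[0-9a-f]+)")
# Matches both "---> Lock #n ..." and "---> Waiting for Lock #n ..." lines;
# the lock address is the last hexadecimal token before "(times locked)".
LOCK_RE = re.compile(
    r"^=== ---> (Waiting for )?Lock #\d+ .* (0x[0-9a-f]+) \(\d+\)\s*$"
)


def wait_edges(lines):
    """Return sorted (waiting_thread, holding_thread) pairs."""
    holders = defaultdict(set)  # lock address -> threads holding it
    waiting = defaultdict(set)  # thread -> lock addresses it waits for
    thread = None
    for line in lines:
        thread_match = THREAD_RE.match(line)
        if thread_match:
            thread = thread_match.group(1)
            continue
        lock_match = LOCK_RE.match(line)
        if lock_match and thread is not None:
            is_waiting, address = lock_match.groups()
            if is_waiting:
                waiting[thread].add(address)
            else:
                holders[address].add(thread)
    edges = set()
    for waiter, addresses in waiting.items():
        for address in addresses:
            for holder in holders.get(address, ()):
                if holder != waiter:
                    edges.add((waiter, holder))
    return sorted(edges)


def to_dot(edges):
    """Render the edges as a directed graph in DOT language."""
    lines = ["digraph locks {"]
    lines += [f'    "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)


# Made-up input in the same layout: two threads waiting for each other.
SAMPLE = """\
=== Thread ID: 0x1 (worker_a started at [ 1] a.c start())
=== ---> Lock #0 (a.c): MUTEX 10 func_a lock_a 0xaa (1)
=== ---> Waiting for Lock #1 (b.c): MUTEX 20 func_a lock_b 0xbb (1)
=== Thread ID: 0x2 (worker_b started at [ 2] b.c start())
=== ---> Lock #0 (b.c): MUTEX 30 func_b lock_b 0xbb (1)
=== ---> Waiting for Lock #1 (a.c): MUTEX 40 func_b lock_a 0xaa (1)
"""

edges = wait_edges(SAMPLE.splitlines())
print(to_dot(edges))
```

A cycle in the resulting graph, such as the two edges produced by the sample, is what a deadlock looks like in this representation.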

Test vectors for SMBus Packet Error Checking (PEC) CRC-8

While implementing the SMBus on the MSP430 (see also the post "An engineering story"), I have been looking for SMBus PEC CRC-8 test vectors but could not find any.

A CRC is a Cyclic Redundancy Check. It is a little piece of data typically added at the end of a packet and used to check with high reliability that no unintended error occurred during transmission (or storage). The math to do the computation and the check of a CRC is not very complicated and can be explained to anybody who knows how to do a long division. The polynomial for the SMBus PEC CRC-8 is x**8+x**2+x**1+1 -- this is a polynomial in GF(2), but you don't really have to understand that part to be able to use CRCs in practice. It corresponds to the binary number 100000111, to be used in a particular way. The following text explains it better than I could: http://www.ross.net/crc/download/cr...

I made my own SMBus PEC CRC-8 test vectors (attached to this post). The format is one test vector per line, like:

TV(616263, 5F)

616263 is the 3-byte message in hexadecimal ("abc" in ASCII). The resulting one byte CRC-8 in hex is 5F.

The test vectors have been checked with an official Java applet from smbus.org. They include at the beginning the result for each one-byte packet, which is also the table for the fast byte-based implementation: CRC = table[CRC ^ byte] (because the initial value to use for CRC is zero). On the MSP430, this implementation should run in something like 9 cycles per byte when dropped in the right place. (xor CRC, Rm /* 3 cycles */; mov table[Rm], CRC; /* 6 cycles */)
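For readers who want to reproduce a test vector, the table-based computation described above can be sketched in Python. This is an illustration on a desktop machine, not the MSP430 code; the names POLY, TABLE, crc8_byte and smbus_pec are chosen for the example.

```python
POLY = 0x07  # x**8 + x**2 + x**1 + 1, with the x**8 term left implicit


def crc8_byte(value):
    """CRC-8 of a single byte, computed bit by bit from an initial value of zero."""
    crc = value
    for _ in range(8):
        if crc & 0x80:
            # The top bit falls off: subtract (xor) the polynomial.
            crc = ((crc << 1) ^ POLY) & 0xFF
        else:
            crc = (crc << 1) & 0xFF
    return crc


# The CRC of each one-byte packet is also the lookup table.
TABLE = [crc8_byte(byte) for byte in range(256)]


def smbus_pec(data):
    """Byte-based implementation: CRC = table[CRC ^ byte], starting at zero."""
    crc = 0
    for byte in data:
        crc = TABLE[crc ^ byte]
    return crc


message = bytes.fromhex("616263")  # "abc" in ASCII
print("TV(%s, %02X)" % (message.hex(), smbus_pec(message)))  # TV(616263, 5F)
```

A handy consequence of the zero initial value is that running the same function over a message followed by its CRC byte gives zero, which is how a receiver can check a packet.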

An engineering story

To be able to track interesting quality metrics of our upcoming XiVO Office product (XiVO IPBX Open Hardware project), we have decided to add temperature sensors to our current XIOH PCB.

In computers, the typical way to report the temperature to the main operating system is through SMBus. This is suitable in our case: we already have an MSP430 microcontroller that handles the power sequence and is connected to the SMBus of the board. We will connect some diodes to the MSP430 to measure the temperature. So the time has come to make use of the SMBus between the MSP and the EP80579 (our main System On Chip), for temperature measurements and also other purposes.

The MSP430 does not have a full featured SMBus controller, only a generic I2C one. SMBus is a variant of I2C, with additional electrical and timing constraints in the physical layer and definition of the messages at the network layer level.

Although formatting and parsing SMBus messages is easy, properly using the I2C controller of the MSP430 in a multi-master environment is not without pitfalls, even if we did not care about the SMBus timing constraints. To do it with the needed reliability, it is necessary to have a detailed knowledge of the whole system and to take into consideration all kinds of interactions on the bus and in the chips. In our business, the reliability wanted by the customers is typically high enough that it makes sense to build robust systems instead of rushing a collapsing sandcastle to market. Plus, in that particular case, we are dealing with the subsystem that brings and keeps the whole board running, and for which the cost to debug in the field is absurdly high.

All complex chips come with various design errors, and the MSP430 is no exception. On the exact version that we use, there are 6 documented errors affecting the I2C controller, of which 4 clearly apply to our board, 1 clearly does not apply, and one required careful system analysis to determine that the preconditions to this erroneous behavior could not happen in our system.

On top of the 4 errata applying to the I2C controller, we have to deal with errata for other parts of the MSP430, plus some detailed aspects that are not errata but also limit the way we can make reliable use of the chip for the tasks we want. Failure to properly take all those details into consideration would lead to eventual faults of various natures, probably including MSP430 crashes impossible to diagnose and leading to spurious shutdowns, systems stuck in the powered state, or any random behavior and degradation of system functions.

The impact could be full-scale, with potential consequences on: availability, maintainability, safety, security, and reliability!

It is worthwhile to note that one of the errata that could have the biggest consequences can only be handled by using one specific software architecture to drive the I2C controller, and that specific software architecture is not the first thing that comes to mind in our preexisting firmware. This is a case where iteration on the design of the I2C code would have meant its complete rewrite.

Complex systems, even moderately so, need a careful design, especially on components that are critical for business or technical reasons. Wishful thinking never produces high reliability and neither does excessive reliance on luck. Modeling, even informally, sometimes pays.

Let us cross rivers or how the dev team learns to bring about major changes to XiVO's architecture, the agile way.

XiVO has been around for over five years now and its use has greatly evolved since then. From small installations of a few users, XiVO evolved to support installations of hundreds of users and more recently major contact centers. We now see XiVO installations with a few thousand users, multiple agents, queues and contexts, and ever-increasing call volumes. This growth has brought the XiVO dev team new challenges concerning the architecture of XiVO, as not all parts of XiVO were ready for this kind of load. We knew it. We knew there were some bottlenecks we would need to address.

We are agile and this means, among other things, that we do not build bridges before we need to cross rivers. Now we're there. We need a bridge! The XiVO dev team has been working hard in the last months to overcome some of the major challenges of XiVO.

One of these challenges is linked with the decentralized/multi-component nature of XiVO. Indeed, as you can read in an older post, XiVO is a rich multi-component ecosystem with way too many inter-relations. Fixing this has been a work in progress for a while now: the mission is to reduce the number of inter-relations as much as possible by better defining each component's jurisdiction.

We recently had to tackle a very specific challenge where XiVO's use of Asterisk's AMI could disrupt basic telephony (A calls B). For people who don't know, AMI stands for Asterisk Manager Interface and is an Asterisk component allowing custom clients to connect and interact with Asterisk via a socket. In the XiVO ecosystem, the CTI daemon (CTId) is the major consumer of Asterisk's AMI. The CTId is a single-threaded Python daemon responsible for handling XiVO client connections. The more users on a given XiVO installation, the more XiVO clients can potentially connect to the CTId and thus the more traffic the CTId exchanges with the AMI. This traffic can be quite impressive on a XiVO installation under heavy telephony load. The CTId's event loop is a synchronous blocking loop, and while it is handling an event the CTId cannot do any other job. This weakness would not be so terrible if the CTId did nothing other than handle XiVO client connections, as it would only impact those connections. This specific issue is still a major one and we'll address how we handled it in a later post.

Now if you remember the schema of XiVO's architecture in our previously cited post, you can see an interconnection between the CTId and the AGI. This relation mainly handled reverse directory lookups, used to display a caller ID for an incoming call whose number matches a directory entry. This was a major issue, as it meant that any call placed while the CTId was 'blocked' handling AMI traffic would not go through: A cannot call B anymore!
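
To illustrate why a lookup queued behind AMI traffic stalls a call, here is a minimal Python sketch of a single-threaded blocking loop. It is not the CTId code: the event names, the burst size and the per-event costs are made-up values chosen only for the illustration.

```python
from collections import deque


def run_single_threaded(events):
    """Process events one at a time, in arrival order, as a blocking loop would.

    Each event is a (name, cost_ms) pair. Returns {name: completion time in ms}.
    """
    clock_ms = 0
    finished = {}
    queue = deque(events)
    while queue:
        name, cost_ms = queue.popleft()
        clock_ms += cost_ms  # nothing else can run while this event is handled
        finished[name] = clock_ms
    return finished


# Hypothetical workload: a burst of 500 AMI events at 2 ms each,
# followed by a single reverse directory lookup costing 5 ms.
ami_burst = [("ami-%d" % i, 2) for i in range(500)]
lookup = [("reverse-lookup", 5)]

finished = run_single_threaded(ami_burst + lookup)
print(finished["reverse-lookup"])  # 1005
```

The lookup itself only takes 5 ms, but in this sketch the caller waits about a second because the lookup sits behind the whole AMI burst. With more clients and more traffic the wait grows accordingly.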

This was not much of an issue when a typical XiVO installation had only a few users, as calls were almost never blocked because of reverse directory lookups (or it happened too rarely for users to notice and report the issue). With growing XiVO installations, it became obviously disruptive.

The solution was to remove anything from the CTId that could impact telephony. From this perspective, any calls to the AGI in the CTId were moved to the AGId, the daemon responsible for handling communications with Asterisk's AGI. It seems quite obvious when you think of it, and one might ask: why oh why was it not already that way? An ever-evolving XiVO dev team with five years of developing XiVO, learning a whole lot along the way, has to be the only responsible answer.

XiVO is becoming more mature every day and so is its development team, producing an ever stronger and more robust software ecosystem. We love building bridges, bring on the rivers!

Showing off XIOHv5 prototype PCB in its laser cut acrylic case

Dear XIOH followers, as we approach the production of our final PCB, we have tried the PCB in an orange laser-cut acrylic case.

Below you can discover the front view of the casing with our famous analog vintage Socotel S63 phone

XIOHv5 orange case front

You can also discover the back view with all the interfaces of our appliance (2 USB, 3 GbE, 4 ISDN T0, FXS, FXO):

XIOHv5_orange_case_back

For those who are willing to discover our latest prototype during live-demos, we will be at :

  • Astricon2012 (Atlanta - 23-25/10/2012) : presenting XiVO Software and hosting a booth
  • Hackfest 2012 (http://www.hackfest.ca/) : we will be using our XiVO IPBX OpenHardware during some hacking contest related to VoIP

XIOH prototype version 5: back from factory

Dear XiVO followers, and hardware fans in particular, this post is intended to give you some feedback on the latest manufacturing run of the XIOHv5 prototype, which leads us to pre-production of a close-to-final PCB.

This post will describe some of the main steps the team went through to get to a fully working prototype.

As a reminder of the previous steps, this PCB (hereafter XIOHv5) is the merge of the 3 PCBs we have developed so far (CPU, ISDN, FXOFXS). We assembled them into one main ready-to-go PCB. It was designed using Eagle 6.2. In a few months, we are going to release the complete design files as open hardware under the OHANDA trademark.

Its main features are:

  • 10-layer PCB
  • 200 distinct references in the BoM
  • more than 1200 components on the board (sorted as T=Top, B=Bottom, BT=Bottom/Top and H=through-hole)

XIOHV5_COMPONANTS_STOCK

Above is a picture of our lab with some of our prototyping stock of components, ready to go to the factory for the assembly of the first 50 boards...

Continue reading...

XiVO Call Center Reporting Revamp

One of the main features of XiVO 1.2 is the call center reporting.

Motivation

The first draft of this new feature proved hard to maintain and slow at generating statistics. Furthermore, the cache format (files containing one month of statistics in JSON format) was not easy to exploit, and viewing daily statistics required extra work because they used raw data instead of the pre-generated cache for speed reasons.

statistic_queue.png

To fix the problems we had with this first version, we decided to rewrite the cache generation so that it does not depend on the statistics configuration, avoiding the need to regenerate the cache every time the configuration is changed.

The new cache format uses tables in the asterisk database to store pre-analyzed data. These tables contain call-related information (table stat_call_on_queue) and statistics for each hour (table stat_queue_periodic).

This new format makes it a lot faster to generate the cache and easier to generate tables and graphics in the web interface. Using an hour as the base time unit for an entry also fixes the problem of overlapping time ranges that we had to solve with the month-based cache, where a week could start in month n and end in month n + 1.
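
As a rough illustration of why hourly entries sidestep the month-boundary problem, the Python sketch below sums a week that starts in September and ends in October from a single list of hourly rows. The row layout, the dates and the one-call-per-hour data are assumptions made for the example; they are not the actual schema of stat_queue_periodic.

```python
from datetime import datetime, timedelta


def hourly_rows(start, end):
    """Yield one hypothetical (hour_start, answered_calls) row per hour in [start, end)."""
    current = start
    while current < end:
        yield current, 1  # pretend exactly one call is answered every hour
        current += timedelta(hours=1)


def total_answered(rows, start, end):
    """Sum the counters of every hourly row whose hour starts in [start, end)."""
    return sum(count for hour, count in rows if start <= hour < end)


# Two months of made-up hourly rows: September and October 2012.
rows = list(hourly_rows(datetime(2012, 9, 1), datetime(2012, 11, 1)))

# A week that starts in September and ends in October.
week_start = datetime(2012, 9, 27)
week_end = week_start + timedelta(days=7)

print(total_answered(rows, week_start, week_end))  # 168
```

Any range aligned on an hour can be computed with the same filter, so weeks, days and months need no special handling when they cross a month boundary.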

Difference from the first version for the user

The cache is only generated for complete hours. This means that if you generate the cache at 12h15, the cache will end at 11h59, and the next time the cache is generated, it will start at 12h00. The cache is also generated on a regular basis (once a day), and since the cache is independent of the configuration, it does not have to be regenerated each time a change is made to the configuration.
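
The complete-hour rule can be sketched in a few lines of Python. The function names and the date are hypothetical and this is not taken from xivo-stat; the sketch only reproduces the 12h15 example above.

```python
from datetime import datetime, timedelta


def current_hour_start(now):
    """Truncate a timestamp to the start of its hour."""
    return now.replace(minute=0, second=0, microsecond=0)


def cache_end(now):
    """Last minute covered by a cache generated at `now` (complete hours only)."""
    return current_hour_start(now) - timedelta(minutes=1)


def next_cache_start(now):
    """First minute that the following cache generation has to cover."""
    return current_hour_start(now)


now = datetime(2012, 10, 1, 12, 15)  # arbitrary example date, 12h15
print(cache_end(now))         # 2012-10-01 11:59:00
print(next_cache_start(now))  # 2012-10-01 12:00:00
```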

For now, the way to generate the cache manually is to run xivo-stat fill_db from the command line on the XiVO. The generate cache option of the web interface will run this action once the old cache is no longer required.

What is done

Currently (version 12.14) all queue counters are computed using the new cache. Some errors are still shown on some pages that we could not port to the new system in time, but most should be fixed in the next version.

We are also working on the first agent counter, which should be available in version 12.15, and all other agent counters should be added in the next 2 versions.

Some changes to the configuration are to be expected once our work on the counters is done, to reflect the new cache generation, but these changes should be minor and all migration will be handled automatically as usual.

Work methodology

Another aspect of this second version that was not present in the first one is that we are also developing tools to generate calls and check the generated statistics, to avoid regressions in future versions. This process has been more time-consuming than developing the counters themselves, but it is a step towards better test automation for XiVO.

Documentation

The documentation for call center statistics is available here.

The code

xivo-stat is available at git://git.xivo.fr/official/xivo-stat.git

xivo-callgen is available at git://git.xivo.fr/official/xivo-callgen.git

XiVO Architecture

The architecture of XiVO is still too complicated. As you can see, the components are virtually all related to each other.

XiVO Architecture

We especially have too many requests from different services to the DB. Since so many components have direct access to the database, there are risks of data corruption. We currently are doing our best to simplify this architecture and to give each component its true purpose. A good example is CONFGEND which was developed to generate the Asterisk configuration by getting the information from the DB. It does one single task and it does it well.

There are many challenges with XiVO's current architecture that we are addressing or will be working on in the near future. Asterisk's AGI should not make requests to the CTID. This is why, since XiVO 1.2.11, we have removed all AGI interaction from the CTID (note that the link between the CTID and the AGI on the architecture schema no longer exists). In the same vein, the AGID should have read-only access to the DB and the CTID should not interact directly with WebServices.

These are a bunch of bad practices that we are aware of. We are up to the challenge, and integrating this architecture evolution into our two-week iterations makes it all even more exciting.

Preparing the production of our newer prototype (XIOHv5)

The past months have been very busy: we have been working on our next prototype, which assembles our 3 existing boards (ISDN, FXOFXS, CPU), already working, tested and running {coreboot, linux, dahdi, asterisk, xivo 1.2} in our lab. In the coming weeks we will publish posts about the testing tools we use to stress our hardware prototypes. This work aimed at optimizing space on our PCB and removing the unused test points and 0R resistors that helped us on the previous release of our prototype. As usual, we use Eagle (6.2) and a bunch of ULP scripts to optimize our development and re-use existing parts of the schematics and board.

Below we present a first picture of the PCB with some tests of connectors soldered on it

XIOHv5_PCB_Connectors

Continue reading...
