As part of a consulting project, we are currently working with colleagues from osAlliance and Softcom on chat in the enterprise environment. Back in the last millennium we already worked with MUDs and IRC, the first generation of chat. Now we have taken a closer look at 30 chat clients.

The second generation of chat applications was shaped by mobile. "@mentions" and audio/video connections were added.

Slack is the WhatsApp of software developers and in 2015 triggered a new wave of development for third-generation chat solutions. The hype was followed by open source platforms such as Rocket.Chat, which we are now testing in a field trial for public administration. In principle, the original idea of Wave is now being developed further, namely inviting people to communicate on a shared "wave". The Wave project developed by Google, however, was discontinued because of its complexity.

Today more than ever, it is about fast, informal, synchronous communication both on the desktop and on mobile, in groups as well as between individuals. Email and forums are often too sluggish for that, and agile software development tools offering kanban or scrum are too structured. Presumably quite a bit will still grow together here.

The stylish open source solution Rocket.Chat is unfortunately not based on the XMPP standard, but it does provide a bridge. Audio/video connections still only work by chance despite a STUN/TURN server. We expect several more development steps here, and with release cycles of less than two weeks they will come.

A solution standardised on XMPP would also work in a distributed ("federated") manner. Via XMPP you can contact anyone, or invite them to a chat, through a Jabber ID (JID = firstname.lastname [at] chatserver.tld). An implementation, however, requires numerous tests on various end devices. Support for Blackberry or Windows Phone is thin, so dedicated clients would have to be developed for them. The challenges include delivering messages to all devices even if they were offline in between (for which there is, for example, the accepted standard MAM, XEP-0313), audio/video connections via Jingle or WebRTC (mixing the two apparently only works with Jitsi Videobridge), and the differing handling of chat rooms and of inviting members to them.
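To illustrate the addressing scheme behind the Jabber ID (JID), here is a minimal sketch that splits an address into its parts. The names `Jid` and `parse_jid` and the example address are hypothetical, and real XMPP libraries apply stricter validation and normalisation than this simplified version.

```python
from typing import NamedTuple, Optional


class Jid(NamedTuple):
    local: Optional[str]     # part before "@", usually the user name
    domain: str              # the chat server; this is what allows federation
    resource: Optional[str]  # optional part after "/", typically one device


def parse_jid(text: str) -> Jid:
    """Split 'local@domain/resource' into its parts (simplified sketch)."""
    # The resource may itself contain "@" or "/", so it is split off first.
    bare, _, resource = text.partition("/")
    local, at, domain = bare.rpartition("@")
    if not domain:
        raise ValueError(f"JID has no domain part: {text!r}")
    if at and not local:
        raise ValueError(f"JID has an empty local part: {text!r}")
    return Jid(local or None, domain, resource or None)


jid = parse_jid("firstname.lastname@chatserver.tld/phone")
print(jid.local, jid.domain, jid.resource)
```

Because the domain is part of every address, a server only needs this one field to decide whether a message stays local or has to be routed to another server.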

This is the shortlist of the most suitable XMPP clients, compiled after many criteria and tests:

  • Android: Conversations
  • iOS: Monal
  • Web: converse.js
  • Desktop: Gajim

Not yet available with XMPP are @mentions, pinning, starring, or forwarding individual messages by email. These do exist in Rocket.Chat, for example, which is why we will gather our next experiences with it in a field trial.

The screenshot shows the XMPP client movim for the web. It is developed by Dutch developers and is modelled on Telegram or Rocket.Chat. We are in contact with them so that two more features are added which we need for a further test case.
As special features, movim can also display notifications (e.g. from an Atom feed), share news from websites, and provide an admin interface for the ejabberd chat server.

Together with the fairkom Gesellschaft we want to put our own chat server into operation, based on the experience from this research. Before that, however, we want to unify registration and login.

What follows is an excellent article taken over from Erlang Solutions, which describes the chat cultures very well (licence note at the end).

 

|  | 1st gen: 1998-2008 | 2nd gen: 2009-2014 | 3rd gen: 2015-... |
| Most valuable feature | Presence and instant chats | Always-on mobility | Group collaboration |
| Synchronicity | Synchronous | Asynchronous | Continuous |
| Protocol | IRC or proprietary | XMPP or proprietary | Websocket based |
| Examples | ICQ, Yahoo!Messenger, AIM, MSN, Gadu-Gadu, QQ, NateOn | WeChat, LINE, WhatsApp, Google Talk/Hangouts, KakaoTalk, Viber, Telegram, Signal / Open Whisper | Slack, HipChat, Otalk, Kaiwa, Zulip, Mattermost, Let's Chat, Rocket.Chat |
| New features | Gadgets | Message actions, mentions, stickers | Full text search, persistency, integrations |
| Platform | Desktop-only | Mobile-only, mobile-first | Mobile AND laptop |
| Windowing mode | Dual-window | Full screen | Single-window |
| Presence | Presence-based | Presence is secondary | Presence is secondary |
| Group chat | Secondary | Secondary | Central |
| Multiple end points | Multi-client | Multi-device, account-based | Multi-device, account-based |
| Media | File transfer (propose+accept) | In-chat media | In-chat media, with list of media and search |
| Voice | Secondary | Important | Secondary |
| Status | Dead | Alive and still growing | New and growing fast |
| Monetisation | Ads | Users are the product | Subscriptions, integrations |

 

Real-time, synchronous text messaging has come a long way!

In the “old days” (some would call that pre-history), the concept was simple - we used Talk on Unix to log in to a machine remotely and chat with another local user on that machine. There were no chat rooms, authentication, authorisation or basic encryption that seem so obvious, ubiquitous and indispensable today.

Then IRC came as an open standard, introducing multi-party chat rooms as a main feature. Different flavors of non-standard extensions and disjointed networks soon followed. One of the most irritating issues was (and still is) the infamous netsplit. All IRC networks combined now hold steady at fewer than 1 million users worldwide.

This era was very diverse and laid the ground for future developments; however the concept introduced in this period was just that of "chat". The next step - moving to actual Instant Messaging - is where our journey really begins.


1st generation: instant messengers, ICQ-like

The instant messenger revolution started around 1998 and 1999 with ICQ as the very first successful player. Then Yahoo!Messenger, AIM and MSN/WLM followed, as well as Gadu-Gadu, QQ, NateOn, LiveJournal, MySpaceIM, Google Talk, and many more, including the only open standard, XMPP or Jabber (not mentioning SIP/SIMPLE here).

a. Sporadic internet connection

In those times, most people connected to the internet using a landline modem. It was slow, expensive and worst of all - loud and annoying (who can forget the “iiiiiiii-eeeeeeee-iii” sound accompanying every session?). No one would connect to the internet for longer than one, maybe two hours since it blocked the phone and ended up costing a fortune.

b. Presence, synchronicity of the user experience

In the very first instant messengers, the concept around which everything revolved was presence, as a core and central feature. When you opened the app you saw a roster listing all your contacts and their status indicated whether they were online or not. Large, blinking icons informed you who was available, busy or away at the moment. Prominent sound notifications let you know the current status of your friends. When logging in, by default, you too were broadcasting to the world that you were online. In both directions, people just wanted to know in real-time who was available to chat, and then engaged in discussion sessions.

c. Desktop-only client

Obviously, the first generation of IM was desktop-only (laptops were still a minority). That meant low bandwidth (before ADSL), but large, comfortable computing resources. Mobile phone screens were just about one or two lines of text in large black LCD pixels, and had no data network (spoiler: the second generation of IM is all about the smartphone).

d. Multi-window software is cool

Since presence and availability in the contact list were the main focus, the chat window(s) came second in importance. IM clients were multi-window (one window per chat) or dual-window (all chats in one window with tabs). Multi-window software was a cool thing at that time, in photo editing, non-linear video editing, and many more. These were more or less targeted at power users.

e. Group chat as a secondary feature

Group chat was a secondary feature. Lots of room types and features were available: public vs hidden, permanent vs short-lived, whitelists vs blacklists, kick vs ban, etc. You could join group chats after you connected, but participants were not necessarily there all at the same time. Anyone could be suddenly disconnected, for whatever reason (modem stability, or parents taking over the phone). You would miss all the conversations happening while you were offline, and besides this, rooms were cluttered with automatic status messages (Mary joined, John disconnected).

f. Offline messages and logs, easing the suffering of absence

Offline messages (sometimes called store-and-forward) and chat logs were made to fix a few of these “chat while I’m not online” issues (the equivalent of the “leave me a message” phone voicemail). But this feature was very limited, being designed only as a workaround, since IM servers were made only for real-time routing and not for storage.
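The store-and-forward idea can be sketched in a few lines. This is only an illustration under simplifying assumptions: the class `StoreAndForwardServer` and its methods are hypothetical, everything lives in memory, and no real IM server is modelled.

```python
from collections import defaultdict, deque


class StoreAndForwardServer:
    """Routes messages in real time and queues them for offline users."""

    def __init__(self):
        self.online = {}                    # user -> live inbox (a list)
        self.offline = defaultdict(deque)   # user -> messages waiting for delivery

    def connect(self, user):
        inbox = []
        self.online[user] = inbox
        # Forward everything that was stored while the user was away.
        queue = self.offline.pop(user, deque())
        while queue:
            inbox.append(queue.popleft())
        return inbox

    def disconnect(self, user):
        self.online.pop(user, None)

    def send(self, sender, recipient, body):
        message = (sender, body)
        if recipient in self.online:
            self.online[recipient].append(message)   # real-time routing
        else:
            self.offline[recipient].append(message)  # store for later


server = StoreAndForwardServer()
server.send("mary", "john", "Are you there?")   # john is offline, so this is stored
print(server.connect("john"))                   # delivered when john connects
```

The queue is emptied on delivery, which mirrors the limitation described above: once forwarded, the server keeps no archive.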

g. File transfers

The de facto standard of file transfers was more or less fixing a problem that was not specifically linked to IM, but was rather leveraging the intrinsic synchronous nature of its usage. At the time, sending “large” (2MB) files by email was impossible, because waiting for the progress bar to finish before closing the mail composition window took too long, because the servers had file size limits, or both. It was a fully synchronous user experience: the sender proposes a file and waits, and the receiver accepts the file and waits for the progress. Sometimes the file proposal could wait for a long time, or even time out or fail. Most had limited bandwidth, and some had censorship (that would be hidden as “security”).

Also, the file transfer was out of band, meaning it was made outside the context of a chat. No marker in the conversation made you aware you had that file transfer in the past. You had another window for all past file transfers.

h. IM with VoIP, not “VoIP applications”

Some IM services offered voice and video in a time when computers were not equipped with built-in webcams and the quality was poor. Most voice and video features were one-to-one and almost none was compatible with landline and cell phones.

i. Specificities: gadgets ;-) and ads :’(

I couldn’t end this review of the first generation without mentioning all the now defunct gadgets, such as buzz/nudge/wizz/attention, and others for mood, activity, music and video.

To conclude, the first generation of IM was highly synchronous, with a lot of technology and user experience limitations, that were later fixed or mostly addressed by the second generation of IM...

2nd generation: mobile messaging apps

The mobile messaging revolution started with the massive consumer growth of multi-touch smartphones, beginning with the iPhone in 2007 and Android in 2008. The new buzzword became “app”. Not software, client, user agent, or even application, just app. A new generation, a reboot.

Among the most well known mobile messaging apps of the second generation was WhatsApp (which was based on XMPP in the beginning and then evolved). In no particular order, the other 2nd generation players are WeChat, LINE, Google Hangouts, Viber, KakaoTalk, Telegram, ChatON, Hike, Kik, and probably Facebook Messenger (which still competes with WhatsApp, despite Facebook owning it).

I am not including here BlackBerry and BBM (BlackBerry Messenger) which were aimed at a more confidential, elite market. Also at the time Skype was only a P2P voice and video call software.

a. Always-on, asynchronous user experience

With the advent of the smartphone, having an always-on internet connection became the norm. Being offline became only an exception, a temporary state. This brought on a massive change in all our lives: from being always OFF (except for a few minutes), to always ON (except for a few minutes).

Mobile messaging apps acquired most user experience from SMS and MMS: as they were always on a network, people would not feel the need to respond immediately anymore. People were finally freed from the obligation to stay on their IM app together at the same time.

This is how IM became asynchronous. Just send a message, you will probably have a reply, someday.

b. Presence became secondary, sometimes useless

Consequently, presence became significantly less necessary. App makers and users simply started considering presence as a bandwidth, scalability, and battery killer. Presence either became secondary or even was completely removed.

c. Mobile-only

The smartphone revolution was so fast and crazy that the first app makers completely forgot the desktop. Lots made mobile-only apps. And in the later days of the 2nd generation, most of them evolved slowly to mobile-first apps. This was a great period as all the usages were completely reviewed from the ground up and a fully new user experience was created.

d. Full screen is obvious (no windows)

Smartphone apps are naturally full-screen, thus mono-window by nature, maximised to fit screen. Of course some tablet/phablet makers propose types of split screens, but the app stays in one window. The late days of 2nd generation IM saw desktop apps and web apps appear. Desktop apps were mostly based on web technologies, probably for cost saving. Most of them were mono-window, with some weird exceptions.

e. Simplified group chat, still a secondary feature

All the complexities of group chat usage and administration went away. Group chat was made simple, possibly too much so: in some second generation IMs, finding the admin, if there is any, is a difficult mission as there is no UI, only a hidden command line. That is due mainly to the race between IM makers and the limited possibilities of small smartphone screens. Also, implementors kept it to a bare minimum, as that was enough to satisfy most users’ needs. Last but not least, one-to-one and group chat conversations were treated equally.

f. Multi-device: transparent real-time synchronisation

The very limited offline messages of the 1st generation were replaced by a full blown account archive approach for all the messages. Synchronisation of the devices with the central message archive has been made transparent to the user. All the apps connected to a unique account, whether they were on one smartphone brand or another tablet brand, were able to be in full sync, in real-time. The multi-device user experience became the following: start a conversation on one device, continue on another, and finish on a third one. All sent and received messages are consistent across devices, in real-time.
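One way to picture the account archive approach is a central, ordered message log from which every device fetches whatever it has not seen yet. The sketch below is an assumption-laden illustration, not a description of any particular app: `AccountArchive`, `Device` and the sequential message ids are hypothetical.

```python
class AccountArchive:
    """Central, ordered archive of all messages of one account."""

    def __init__(self):
        self._messages = []  # position in the list + 1 serves as the message id

    def append(self, sender, body):
        self._messages.append((sender, body))
        return len(self._messages)  # id of the message just stored

    def since(self, last_seen_id):
        """Return (id, sender, body) for every message after last_seen_id."""
        newer = self._messages[last_seen_id:]
        return [(last_seen_id + offset, sender, body)
                for offset, (sender, body) in enumerate(newer, start=1)]


class Device:
    """A phone, tablet or laptop that keeps a local copy of the history."""

    def __init__(self, archive):
        self.archive = archive
        self.last_seen_id = 0
        self.history = []

    def sync(self):
        for msg_id, sender, body in self.archive.since(self.last_seen_id):
            self.history.append((sender, body))
            self.last_seen_id = msg_id


archive = AccountArchive()
phone, laptop = Device(archive), Device(archive)
archive.append("me", "Started on the phone")
phone.sync()
archive.append("friend", "Continue on the laptop?")
laptop.sync()
phone.sync()
print(phone.history == laptop.history)
```

Each device only remembers the id of the last message it has seen, so a device that was switched off for a week catches up with a single request.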

g. In-chat media, no more “file transfer”

Proposing a file and waiting for acceptance has been phased out. Since the connectivity was not predictable, one could not afford to wait for a contact to be online and willing to accept the file. So, IM makers promoted media files as the equal of messages. Photos and pictures, short sounds & videos, potentially any file, plus location could be sent like any other message, and received, notified and archived exactly like any other message. The user experience for the sender was just a simple file selection and a click/tap on the send button… then put the smartphone back in the pocket, or type your next message. The app simply takes care of the rest. The user experience for the receiver was just a single notification for the media, like any other type of message. Of course, most apps offer an option to download large files only on wifi, in order to protect the data plan from overspending. The propose/accept UX was disrupting the conversation. With in-chat media, the UX became a single, continuous flow inside the same conversation.

h. Voice and video, better hardware

Real progress was made on voice and video, thanks to better hardware, with de facto standardisation and front camera in smartphones, and later front webcams in laptops. Also, in the late 2nd gen large bandwidth improvements were made thanks to 3G, and then 4G on mobile, and ADSL and optic fiber at home. People now expect most voice calls to work at least with the same quality as regular cell and landline phones. Video calls are much more sensitive to bandwidth limitations, thus users are still prepared to accept some glitches (although the degree of acceptance varies depending on whether it’s personal or business use).

i. Specificities: message actions, mentions, and… sticker craze!

On IM, typos are expected, as you type fast on a small keyboard, and you have auto-correct. Now, you can make sent message corrections. After a message was sent (and probably received), you now have the possibility to edit it on both ends (including archive).

Also, lots of apps propose the capacity to reply to a specific message, to quote or forward or resend a message, and to mention someone with the @nickname notation. These features have made chat much more mature.
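As an illustration of the @nickname notation, here is a minimal sketch of how a client might pick mentions out of a message. The function name `extract_mentions` and the set of characters allowed in a nickname are assumptions; each app defines its own rules.

```python
import re

# "@" followed by a nickname, but not when the "@" sits inside a word
# (which would be the case for an email address such as bob@example.org).
MENTION = re.compile(r"(?<!\w)@([A-Za-z0-9_.-]+)")


def extract_mentions(text):
    """Return the mentioned nicknames in order of appearance, without duplicates."""
    mentioned = []
    for name in MENTION.findall(text):
        name = name.rstrip(".-")  # drop sentence punctuation caught by the pattern
        if name and name not in mentioned:
            mentioned.append(name)
    return mentioned


print(extract_mentions("Thanks @mary and @john.doe, mail bob@example.org. cc @mary."))
```

The list of nicknames can then be used to highlight the message or to trigger a notification for the people mentioned.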

A feature has also been borrowed from SMS and largely enhanced: the markers for sent, received, read. That was shown by ticks, labels, or simply metadata text. Some even went further with watermarking a conversation, saying a contact has read all messages until this one.

Stickers, stickers and stickers! Lots of IM apps proposed stickers. You could send a conversation-wide image, cuter and richer than smileys. Stickers are also much lighter than images, as they are just a suite of characters sent over the wire, sticker images being stored locally on each device. These enabled people to convey emotion much more rapidly and with more fun. Some made stickers a source of revenue, selling sticker packs, or letting advertisers pay to propose free-to-the-consumer, branded sticker packs.

Summary

The second generation is the era that made IM asynchronous, and long-lived. Much like the SMS experience from which it borrowed a lot, but also to which it added great maturity.

The second part of this article, which follows, focuses specifically on the third generation of Instant Messaging and the ongoing revolution.

3rd generation: group messaging

The group messaging revolution began around 2015-2016, with the advent of Slack and HipChat. Other services in the third generation are Zulip, Otalk, Kaiwa, Mattermost, Let's Chat, Rocket.Chat. Since the third generation revolution is still ongoing, observing and understanding what is happening is much harder. We don’t yet have any strong long-term feedback, and the race is ongoing. But again, a new generation, a new disruption is evolving.

a. Continuous computing experience

The context evolved once again. Mobile devices have overtaken other forms of computer in sales. People use smartphones as their first (sometimes only) device. But, overall, the smartphone revolution calmed down a little. Many companies now practice “BYOD” (“Bring Your Own Device”), allowing employees to use their personal smartphone and laptop for work. People generally use a laptop at work, a tablet on the couch, and a smartphone on their commute, coffee break, and lunch break, not to mention parties, family dinners and toilet breaks.

b. Presence makes a shy comeback, but remains a secondary feature

We learned that in the 2nd generation presence was missed by users when removed from apps, or still used as a secondary feature. When it comes to presence, there is continuity between 2nd and 3rd generation, but with a slight comeback: although presence does not always come as a central feature, it is only one click or tap away.

c. Mobile AND desktop in the real world

The 3rd generation IMs come in different flavors: the app is available on one or two (or more) mobile platforms and on one or two (or more) desktop platforms. These pieces of software may not come with feature parity due to all these apps’ lifecycles, but the overall user experience is mostly there already. On desktop, software is often available in the web browser. And when it comes as desktop software, it is often just the web app delivered as packaged software built on web technologies.

d. Single window mode, for simplicity, and flat design

All the 2nd generation IMs that now provide some desktop experience offer single window software. Not because of the web technologies they are based on, just because multi-window is far too complex for massive user bases.

The responsive design of the few desktop apps and websites of the 2nd generation has been generalised, and consequently brought about the 3rd generation. If you resize your desktop app window to a tiny size, you have a UX close to that of the mobile app (minus multi touch input), when you resize it to average size, you get a tablet-like experience, and when you really enlarge it, you have the full blown desktop experience.

Flat design is the default of the 3rd generation, whether it is done in Apple style, Material design by Google, or Metro-style by Microsoft. This is a minimalist UI design genre or language, getting away from the skeuomorphic paradigm that imitates previous generations’ interfaces. Even 2nd generation apps are now phasing out non-flat design.

e. Group chat is the main core feature

Group apps of the 3rd generation IM naturally prioritise group chats, or at least do not relegate them to a secondary feature. Conceptually, in group apps, the chat rooms or channels are not just an extension of one-to-one chats, but rather one-to-one chats are a downsized version of group chats. That is one very small technical difference from 2nd generation that makes for a totally different user experience.

Other more obvious differences are groups having persistent files and links and stars/favorites, and room/channel notifications. And let’s not forget full text search, but more on that in a bit.

Some may argue that the 3rd generation IM is a return to what IRC was: a group chat at the core. Indeed, but it is undeniable that the 3rd generation IM is much more user-friendly and technologically advanced than IRC clients.

f. Continuous devices synchronisation

The 3rd generation is multi-device by default and by nature: it is an extension of the 2nd generation, which was mostly mobile-only (at least at the beginning), and it completes the device scope with desktop apps. The idea is to offer continuous flow, not only with message sync, but also with unread and file sync, outside the app.

The 3rd generation is still as asynchronous as the 2nd generation, but it adds email notifications. These are slightly delayed notifications (non real-time), for when you can’t participate in conversations. These emails tell you what you have missed, via a bulk of messages at a time. It might not seem like much, but it fills a gap and underlines the missing parts of asynchronicity.
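Such a delayed email notification boils down to collecting what a user has missed and sending it in one batch. The following is a rough sketch under stated assumptions: `build_digest`, the message tuples and the plain-text layout are hypothetical, and actually sending the email is left out.

```python
from collections import defaultdict


def build_digest(messages, last_read_at):
    """Build the text of one digest email from all messages missed so far.

    messages: iterable of (sent_at, channel, sender, body) tuples
    last_read_at: timestamp up to which the user has read everything
    """
    missed = defaultdict(list)
    for sent_at, channel, sender, body in messages:
        if sent_at > last_read_at:
            missed[channel].append(f"  {sender}: {body}")

    lines = []
    for channel in sorted(missed):
        lines.append(f"#{channel} ({len(missed[channel])} missed)")
        lines.extend(missed[channel])
    return "\n".join(lines)


messages = [
    (1, "dev", "mary", "build is green"),
    (5, "dev", "john", "deploying now"),
    (6, "general", "mary", "lunch?"),
]
print(build_digest(messages, last_read_at=2))
```

An empty result means nothing was missed, in which case no email needs to be sent at all.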

g. Persistence of in-chat files

In-chat media files have stayed and the 3rd generation brought some improvement, with persistent files directories, often materialised in sidebars. This is an advancement over the in-app galleries of the 2nd generation, which only allowed you to scroll through pictures.

The file transfer of the old days might be missing now, for file exchanges in fire-and-forget mode (out-of-chat propose+accept UX), with no archive, and no search. The common workaround is using external, third party file sync services such as DropBox, Box, Google Drive.

h. Voice and video: stagnation or regression

Since 3rd generation apps focus on groups, multi-party VoIP is the target, the Holy Grail that the 2nd generation has not achieved. This is very difficult to achieve, which may be why implementors are taking their time. The goal might be not to disappoint high expectations.

i. Specificities: full text search, stars, integrations’ craze, ChatBots

One of the most prominent - if not the most prominent - features of 3rd generation is full text search of messages and files. One can finally search old messages and conversations, same as you can do with emails. In a multi-device context that can be very useful, for example when you recall the general idea of a conversation but not all the details. Some seek to monetise the archive, by limiting a free user’s searches.
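Full text search over a message archive is commonly built on an inverted index, which maps each word to the messages containing it. The sketch below shows the idea in its simplest form. It is an assumption for illustration only: the class `MessageIndex` is hypothetical, and production systems add ranking, stemming and persistent storage.

```python
import re
from collections import defaultdict


class MessageIndex:
    """A tiny in-memory inverted index over chat messages."""

    def __init__(self):
        self.messages = []                 # message id -> original text
        self.postings = defaultdict(set)   # word -> ids of messages containing it

    @staticmethod
    def _words(text):
        return re.findall(r"\w+", text.lower())

    def add(self, text):
        msg_id = len(self.messages)
        self.messages.append(text)
        for word in self._words(text):
            self.postings[word].add(msg_id)
        return msg_id

    def search(self, query):
        """Return all messages that contain every word of the query."""
        words = self._words(query)
        if not words:
            return []
        ids = set.intersection(*(self.postings.get(w, set()) for w in words))
        return [self.messages[i] for i in sorted(ids)]


index = MessageIndex()
index.add("Release planned for Friday")
index.add("Lunch on Friday?")
index.add("Release notes are in the wiki")
print(index.search("friday release"))
```

Because the lookup only touches the words of the query, search time depends on how many messages match rather than on the size of the whole archive.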

One can also star/favorite or pin almost everything! It started shyly in the 2nd generation, but the 3rd generalised it. Starred/favourited content generally features in sidebars, sometimes adding unnecessary clutter to the overall appearance and UX.

After group focus and search, the integrations’ craze is possibly the most important trait of the 3rd generation. Integrations are hooks or links to third party apps and services. They help fill the gap between these apps and the chat system. They help avoid disruptions between chat and other apps: team members used to have to switch between apps all the time, now they have everything in one place and in real-time. This increases retention and engagement on the chat app.

ChatBots are also becoming huge, since they can help with some serious tasks, and even with some fun, unproductive (ahem procrastination) ones, and the underlying AI is evolving rapidly.

Integrations and ChatBots are tightly linked to the ChatOps movement. ChatOps is “everything of the business process on chat”. Integrations and ChatBots progress and refine together, enabling more team fluidity.

Integrations are huge, not only because everybody jumps on the bandwagon, but also because they enable apps to become a real marketplace. The 3rd generation IMs are generally selling subscriptions for teams (with a free tier), and integrations and ChatBots enlarge their capacity to monetise.

Summary

The third generation made the IM a continuous flow between all your devices, where nothing is ever missed. 3rd generation IM defining traits include: group chat first, full text search, email notifications, integrations and ChatBots.

As it is an ongoing process, the usages will be refined during the coming period.


Generation 2.5: specialisation, segmentation

While the 2nd and 3rd generation IMs want to reach the mass market, some 2nd generation apps have chosen more specialised markets. This is the generation 2.5, which sits between generation 2 and generation 3.

Ephemeral messaging

Ephemeral messaging is about self-destructing messages. These could be photo-only or photo-first apps, but also disappearing text messages. Snapchat is the best known example.

Secure messaging

Encryption is at the core. Sometimes, expected features such as multi-device are missing. Encryption is evolving so fast that some apps become obsolete rapidly. Recent, highly publicised concerns surrounding privacy and transparency will not be resolved quickly, easily or on a supra-national basis.

Mesh networks

Mesh networks use no central servers, rather using mostly the cell network, bluetooth, and wifi. They are used for co-located and synchronous events such as festivals or protests.

Side notes

Here are some general observations, that might not obviously fit in one generation or another.

Account creation and contact management are features that have evolved a lot. But no clear pattern is taking shape that would reveal a general direction tied to a specific generation.

Message receipts (such as sent, received, read/seen) appeared around the 2nd generation as these imitate the SMS experience, but then disappeared from the 3rd generation. They may yet make a comeback as auditability is clearly important in corporate and regulatory contexts.

Speaking about monetization: the 1st generation was all around ads, and interestingly enough around avatars in Asia. The 2nd generation is about gathering user data for big data and profiling/targeting, and interestingly enough around mobile-local-social marketplaces in eastern countries. The 3rd generation is about building subscription platforms, with integrations and ChatBots marketplaces.

Another interesting trend is apps adding IM to their core functionalities, despite not being IM apps. Indeed some apps add basic to complete IM features, in order to generate acquisition, engagement, and retention. To give an example, IM is a helper for the main focus of photo sharing and dating apps.

The adoption and social impact of IM is different: the 1st generation was about reaching a mass market, and is now dead (some zombie apps are still walking around). The 2nd was about the generalisation of IM with an initial focus on mobile, but still with disjointed networks, and incompatible apps. The 3rd generation will probably not take over the 2nd generation since the focus is on groups and collaboration. So 2nd and 3rd generations are here to co-exist for a while, and will probably feed off each other in features and usages. Of course the 3rd generation is still in its infancy, so expect more to come at a fast pace.

As usages and technologies evolve with upcoming generations of users, the experience will continue to evolve rapidly and massively.

Most of these apps offer absolutely no interoperability: you have to create an account with an app to chat with a contact on that app. Of course, the only open standard protocol, XMPP, enables federation, much like email does: you can send a message to a contact on another service.

License: CC-by-sa Roland Alton (German part) and Nicolas Verite (English part)

Submitted by rasos on