Last week, I read a post at High Scalability talking about getting research out of academic environments. There is a lot of technology that starts as academic research, but there is a problem with some research being adopted by other technologists. That post has a very good explanation of why:
Over the years I’ve read a lot of research papers looking for better ways of doing things. Sometimes I find ideas I can use, but more often than not I come up empty. The problem is there are very few good papers. And by good I mean: can a reasonably intelligent person read a paper and turn it into something useful?
That may not make sense at first glance if you have not read an academic paper recently. Just head over to the ACM, search for an interesting research topic, and see whether you have any idea how to use the new technology. These papers are typically unreadable for the average developer. Again, High Scalability has a nice quote:
There’s a lot out there in the literature that we could be making use of right now, but it’s closed off from the people, i.e., developers, who can turn this research into gold. And it’s largely closed off because researchers don’t consider developers as an audience and they don’t write their papers with the intention of being applied.
The problem here is that academic success is not measured by how many people have adopted and used the technology. Academic success is normally measured in research grants and tenure. However, academia is not the only group publishing information that is not immediately useful to developers. Just look at the average technology specification from the W3C. As an example, go to the XQuery Current Status page. Because I am currently working with XQuery, I understand that there are several parts to the technology. If you are a developer who wants to learn about XQuery, the last place you would go is the W3C page. Why? Look at that page again. You see that there are 6 current standards:
- XQuery 1.0 and XPath 2.0 Data Model (XDM)
- XQuery 1.0 and XPath 2.0 Formal Semantics
- XQuery 1.0 and XPath 2.0 Functions and Operators
- XSLT 2.0 and XQuery 1.0 Serialization
- XQuery 1.0: An XML Query Language
- XML Syntax for XQuery 1.0 (XQueryX)
Try to read the Functions and Operators specification. You may be able to understand some of it, but it is not entirely clear how you are going to use it. Another issue is that 6 different specifications are required to support XQuery 1.0. XQuery is already a niche technology, and on top of that the barrier to entry is very high. Thankfully, there are books that help simplify the process of learning these new technologies.
Could we make some of these complex technologies more adoptable? Let’s take Java as an example. I have no idea how many specifications are required to implement the full Java and JEE stack without EJB. However, one of the main reasons Java gained significant adoption is that there is typically a reference implementation available. It may not have been immediately available, but fairly soon after the specification was finalized there was an implementation. In some cases, an implementation from outside of the standards body was chosen as the reference implementation. Tomcat is an example of this when it comes to the JSP/Servlet specifications, even though it was eventually forked for further specifications.
I have talked about the difference between action and ideas before. That is even more apparent in this divide between academia and developers. A book I am reading, Innovation: Need Of the Hour, has an excellent example. Sramana Mitra asked a question of the person who “popularized the idea that random accidents and uncertainties … determine the course of history”. Obviously, that idea is a bit controversial:
“What are you going to do about your thesis?”
His answer: “I don’t do. I just think and write.”
This is a serious issue when it comes to those people that say “It works in theory, but not in practice”. Theory is the basis of much knowledge that you have, but if the theory does not work in practice, then the theory is useless. There are plenty of theories or algorithms that work out fine, but in real world situations they become unusable. When you look at algorithms, some are CPU-bound, some are memory-bound and others may be I/O-bound. Which algorithm you choose is highly dependent upon your situation. In theory, any of these algorithms would apply. However, if there is a reference implementation of these algorithms, you can quickly determine which algorithm will work in your environment. This would increase adoption as more people can easily implement the technology.
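To make that point concrete, here is a minimal sketch in Python of what a tiny reference implementation might look like. It is my own illustration and does not come from any particular paper: the bit-counting problem, the function names (`popcount_loop`, `popcount_table`, `measure`) and the 16-bit table size are all assumptions chosen for brevity. One variant spends CPU time and the other spends memory on a lookup table, and a small timing helper lets you see which one suits your environment.

```python
import time


def popcount_loop(n: int) -> int:
    """CPU-heavy variant: count set bits one at a time, no extra memory.

    Assumes n is a non-negative integer.
    """
    count = 0
    while n:
        count += n & 1
        n >>= 1
    return count


# Memory-heavy variant: precompute the answer for every 16-bit value.
# The table size is an illustrative choice, not a recommendation.
_TABLE = [popcount_loop(i) for i in range(1 << 16)]


def popcount_table(n: int) -> int:
    """Count set bits 16 at a time using the precomputed table.

    Assumes n is a non-negative integer.
    """
    count = 0
    while n:
        count += _TABLE[n & 0xFFFF]
        n >>= 16
    return count


def measure(func, inputs):
    """Run func over every input and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [func(n) for n in inputs]
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    # Hypothetical workload: large numbers with many set bits.
    workload = [(1 << 60) - k for k in range(1, 20001)]
    for func in (popcount_loop, popcount_table):
        results, seconds = measure(func, workload)
        print(f"{func.__name__}: {seconds:.4f}s for {len(results)} inputs")
```

The timings will differ from machine to machine, which is exactly the point: with working code in hand, you can measure instead of guessing.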
Open source reference implementations are even better for developers. Many developers do not like reading books to learn about technology; they learn from reading, writing and executing code. An open source implementation gives the developer some code to read that actually works. The reference implementation does not have to be a reusable library, and it could even execute slowly, but if it exists, developers can learn from it and improve it.
So, the next time you have some interesting new technology in some academic paper, develop a reference implementation that people can use and look at. The adoption of that new technology could come very rapidly.