Microservices Part 3 – How to call a Microservice

I’m busy working away on a new Microservices using Spring Boot course for Virtual Pair Programmers. I hope my next blog post will be a draft running order with an estimated release date: in the meantime as promised I’m going to look at how to call a Microservice.

Along the way I’ll point out how Spring Boot can help – at the same time this is helping me to decide what needs to be on the new course.

As per last week, my old monolith (which is eventually going to be broken down until nothing remains) is going to call a new Microservice, the “VAT service”. It has a single responsibility: to return the tax due on an amount, based on the country of residence for a customer.

It’s easy to make these things simple on a training course – in real life various governments have conspired to make VAT a living nightmare, so really this service has to deal with IP addresses, Physical Address and location of requesting bank. But that’s ok, the microservice still has a single responsibility. Don’t be afraid to build microservices which may feel trivially small – that’s part of the point (and they tend to grow anyway).

Obviously, we need to call this microservice. How?

Approach 1: The easy way – a naked REST call

Although I advised in the previous blog that you can and should consider other remoting solutions such as gRPC, let’s assume we’re going with REST. It’s kind of standard.

As you’re reading this blog, you’re probably a Spring fan, so let’s also assume that the caller (client, at present the monolith) is using Spring. So the natural choice is to go for the RestTemplate.

We’ve covered the RestTemplate extensively in our Spring Remoting course, so I won’t labour the point. But something like:

String countryRequired = "GBR";
Percentage vatRate = template.getForObject("http://localhost:8039/vat/{country}",Percentage.class,countryRequired);

Almost a no brainer. Things to consider:

  1. We need to get rid of the hardcoded URI – ideally we would have a service discovery solution as part of our architecture. I touched on this briefly last time, but Spring Boot has a plugin which wraps up the very easy to use Eureka, which was originally built by Netflix. Although we use Kubernetes on our live site, I’ve decided that as Eureka is so tightly integrated with Boot, it would be a shame not to cover it. So it’s going to be on the course!

  2. What happens if the VAT service is down? In a Microservice architecture, you must assume that at any one time, at least one service is likely to be unavailable. Again, further Netflix components can help, and Spring can easily integrate with Ribbon (for load balancing) and Hystrix (for circuit breaking – more on circuit breaking in a future blog post).

Together, these two sub-frameworks can lead to a very robust architecture. I’ll be making sure that our practical work on the course explores this in full.
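To make the circuit breaking idea a little more concrete, here is a minimal sketch in plain Java. It is not Hystrix's API: the class name, the threshold and the timings are all assumptions made up for illustration. The idea is simply that after a number of consecutive failures we stop attempting the remote call for a while and go straight to a fallback.

```java
import java.util.function.Supplier;

// Hypothetical, deliberately tiny circuit breaker; this is NOT Hystrix's API.
final class SimpleCircuitBreaker<T> {
    private final int failureThreshold;   // consecutive failures before the circuit opens
    private final long openMillis;        // how long to stay open before trying again
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        long now = System.currentTimeMillis();
        boolean open = consecutiveFailures >= failureThreshold
                && now - openedAt < openMillis;
        if (open) {
            // Circuit is open: do not even attempt the remote call.
            return fallback.get();
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;      // a success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = now;           // open (or re-open) the circuit
            }
            return fallback.get();
        }
    }
}
```

The RestTemplate call from earlier would be passed in as the first argument, with something like a cached or default VAT rate as the fallback. A production library does far more than this (timeouts, metrics, thread isolation), so treat it as a teaching aid only.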

Approach 2: using Feign to hide the remote call

Naked REST calls are all well and good – they’re simple – but I always get the feeling that I’m breaking an abstraction. I don’t want to feel that I’m making an HTTP call(*) – as a business programmer, I’m calling a service and that’s how I want to think in the code.

(*) Note: this will make some people angry. When working on distributed systems, we must never forget the Fallacies of Distributed Computing. In this case, we must never forget that we are making a remote call and it can 1) fail and 2) take a long time. Many argue that by abstracting away the remote call, we are making it easy to forget this. It’s a good point which I accept and remain mindful of.

It would be great if I could call this service using idiomatic Java/Spring/Dependency Injection, a little like this:

public class BlahBlah
{
   @Autowired
   private VatService remoteVatService;

   public void billCustomerOrWhatever( .. params .., String countryOfOrigin)
   {
      Percentage vatRate = remoteVatService.findVatRateForCountry(countryOfOrigin);
      // blah blah blah      
   }
}

And we can! Yet another element of the Spring Cloud library is called “Feign”. I admit I didn’t know about this until recently (how do you keep up with Spring when it expands faster than my brain cells can work?) – I’ll be covering it on the course but it’s as simple as declaring the interface in the usual Java way:

public interface VatService
{
   public Percentage findVatRateForCountry(String country);
}

Rather like with Spring Data JPA (which I covered in the Spring Boot course), you do NOT implement this interface – it’s done for you via a generated runtime Proxy.
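If you're wondering how an interface with no implementation can do anything at all, the JDK has supported runtime proxies for years through java.lang.reflect.Proxy. The sketch below is not a claim about how Feign works internally. It only illustrates the general idea, using a simplified, hypothetical VatLookup interface that returns a String, and a handler that describes the HTTP call instead of making it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

final class ProxySketch {
    // Simplified, hypothetical stand-in for VatService; returns a String for brevity.
    interface VatLookup {
        String findVatRateForCountry(String country);
    }

    static VatLookup create(String baseUrl) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // toString/equals/hashCode are not part of the remote API.
                throw new UnsupportedOperationException(method.getName());
            }
            // A real client would issue the HTTP request here;
            // this sketch only describes the call it would make.
            return "GET " + baseUrl + "/" + args[0];
        };
        return (VatLookup) Proxy.newProxyInstance(
                VatLookup.class.getClassLoader(),
                new Class<?>[] { VatLookup.class },
                handler);
    }
}
```

Calling ProxySketch.create("http://localhost:8039/vat").findVatRateForCountry("GBR") goes through the handler even though nobody ever wrote a class implementing VatLookup.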

You do need to add a few annotations, so that the generation knows how to translate the Java into REST calls. Cleverly, we use standard SpringMVC annotations (of course usually these annotations are used when defining the server side – this is the first time I can think of where I’ve used the annotations client side!)

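// "vat-service" is an assumed service id (the name the VAT service registers under); "/vat" is the path prefix
@FeignClient(name = "vat-service", path = "/vat")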
public interface VatService
{
   @RequestMapping(method=RequestMethod.GET,value="/{country}")
   public Percentage findVatRateForCountry(@PathVariable("country") String country);
}

Beautifully, this all integrates with the Ribbon load balancer that I mentioned above, so if we’ve replicated the service on multiple nodes, it’s easy to provide failover and fallback behaviour.

Approach 3: Use Messages

The call to the VAT service probably needs to be synchronous, because we absolutely need to know the answer before the user can proceed with what they’re doing. But in many cases, a message driven solution is another way of building a robust system.

A working example is that our system needs to record viewing figures. Every time a video is watched by a subscriber on our site, we record this as a “watch”. We use the data to decide which courses are a hit, and we can also identify unusual viewing patterns (this is a polite way of saying “we can find out who is using site scrapers”).

If we have a microservice which is responsible solely for recording viewing, we can of course use REST to log the view.

Notice the new service has its own private database, as described in part 1. We’ve chosen MongoDB – it has its detractors, but it will work well for this type of data. We can easily store large blocks of the viewing figures in memory, so it’s going to be easy to do fast calculations and aggregations. When this was handled by the monolith in MySQL, doing even basic calculations was grinding the whole system to a halt. One of the joys of microservices is we can make these decisions without too much agony – if it doesn’t work out, we can tear down the whole microservice and replace it with a different solution. I call this “ephemerality” but it’s a pompous word so it will never catch on.

This call doesn’t need to be synchronous – the video can safely play even if we haven’t yet logged the viewing. There are ways of making asynchronous REST requests (Spring features an @Async annotation which runs the annotated method on a separate thread – I’ve never covered this on any course, but I will maybe get around to that someday).

But this is a great use for messages. Instead of making a call to the service, we could just fire off a message to a queue. We don’t care who consumes the message – we just know we’ve logged that message.

We would then just make the ViewingFigures service respond to messages on that queue – or, even better, we could use a Topic instead of a Queue. With a Topic, multiple consumers can register an interest. So, in the future, if new services are built which are also interested in the EVENT that a video has been watched, well, they can subscribe to that topic as well.

A Topic has multiple subscribers which can be added over time. Note that on AWS this is implemented using SNS, the Simple Notification Service.
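The difference between a queue and a topic is easy to sketch in plain Java. What follows is an in-memory toy, not SNS or JMS: the class name InMemoryTopic is an assumption for illustration, and there is no persistence or network involved. It only shows that the publisher neither knows nor cares how many subscribers there are.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-memory topic: every subscriber receives every published event.
final class InMemoryTopic<E> {
    private final List<Consumer<E>> subscribers = new CopyOnWriteArrayList<>();

    // New services can register an interest at any time.
    void subscribe(Consumer<E> subscriber) {
        subscribers.add(subscriber);
    }

    // The publisher fires the event without knowing who is listening.
    void publish(E event) {
        for (Consumer<E> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}
```

The ViewingFigures service would be one subscriber; a later "recommendations" service (purely hypothetical) could be added as a second one without touching the publishing code.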

This gains us the robustness that we desired above, without the need for extra plumbing such as circuit breakers and load balancers. If the viewing figures service goes down, it’s no problem to the monolith as it isn’t calling it – it’s just sending a message to a queue or topic. The queue will start to backlog the messages until the service comes back up again, and then the service can catch up on its work.

Things to think about: it is essential to ensure the queue has an extremely high uptime. With a few mouse clicks (* see footnote), Amazon SQS automatically provisions a queue which is transparently duplicated across multiple Availability Zones (data centers). You can’t assume it will NEVER go down and you must code for this on the calling side. In this case, I would log the exception and carry on, it’s no disaster if we miss some viewing records.
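On the calling side, "log the exception and carry on" might look something like the sketch below. The MessageQueue interface is a hypothetical stand-in for whatever messaging client you actually use (it is not the SQS SDK), and the message format is made up.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class WatchRecorder {
    // Hypothetical stand-in for a real messaging client (SQS, JMS, ...).
    interface MessageQueue {
        void send(String message) throws Exception;
    }

    private static final Logger LOG = Logger.getLogger(WatchRecorder.class.getName());
    private final MessageQueue queue;

    WatchRecorder(MessageQueue queue) {
        this.queue = queue;
    }

    // Returns true if the message was handed to the queue, false if it was dropped.
    boolean recordWatch(String subscriberId, String videoId) {
        try {
            queue.send(subscriberId + " watched " + videoId);
            return true;
        } catch (Exception e) {
            // Missing a viewing record is no disaster, so log it and let the video play.
            LOG.log(Level.WARNING, "Could not record watch, carrying on", e);
            return false;
        }
    }
}
```

Swallowing the exception is only acceptable here because the data is non-critical. For something like billing you would want a retry or a local buffer instead.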

Although we’ve covered messaging in standard JavaEE (and we have a course covering this on WildFly releasing soon), for some reason we’ve never covered messaging for Spring. So that’s going to go on the new course as well!

As always, I’m sorry for the long blog post, I didn’t have time to write a shorter one – I’m busy working on the new course!

(* footnote) Edit to add: Ahem, I meant, of course – “with a simple script, under source control, using a tool such as Puppet, Chef or Ansible”. That needs to be a course too!

Microservices, Part 2 – how to deploy

Here’s the next in a short series (I think it will be three parts) where I’m musing about Microservices. My challenge is that VirtualPairProgrammers absolutely needs a course on it, but I’m not sure what form that will take. After writing these first two parts, I’m beginning to think that for development, we just need to extend our existing Spring Boot and JavaEE/Wildfly courses, but a further course on Deploying Microservices will be needed. This blog post will focus on that.

In part 3, I’ll return to the “dev” side of things and look at how using events can make your system more loosely coupled.

In Part 1 I described the overall concepts in Microservices, and it turns out to be not too complicated:

 

  • Services aligned to specific business functions
  • Highly cohesive services and loose coupling between them
  • No integration databases (meaning each service will typically run its own data storage)
  • Automated and continuous deployment.

Actually implementing a microservice is not too hard. Designing an overall architecture where the services collaborate to achieve an overall goal, that’s a bit harder – but what’s really hard is deploying a microservice architecture. To put another way, the real magic in microservices is in the “Ops” rather than the “Dev”.

Unless you’re planning on rolling your own infrastructure tools (Netflix did this and they’ve open sourced them – more later), you’re going to rely on open source tools, and lots of them. There are hundreds – probably more – that you could consider. It’s overwhelming, and every day, new tools are emerging. To try to get you started on the Microservices path, this article is going to look at a very simple microservice and the not-so-simple tools needed to get it running.

Note: this article is not intended to be authoritative. These are just the choices we’ve made and the reasons why. There will be other solutions, and plenty of tools that I’ve never even heard of. I consider Microservices a journey, and our system is certain to evolve dramatically over the coming years.

Also, I’ll be sticking with the tools I know – so for the implementation of the actual service, I’ll probably use Spring Framework, JavaEE or associated technologies. If you’re coming from other languages, then of course you will have your own equivalents.

Our System at VirtualPairProgrammers.

As described in part 1, our website is deceptively simple – it’s a monolith, and behind the facade of the website we’re managing well over 20 business functions. It has worked well for us, but it was getting harder and harder to manage. So we decided to migrate to a microservice architecture.

But so much work to do! At least 20 microservices to build! Where do we start?

Well, one appealing thing about microservices (for me) is you don’t have to do a big bang migration – you can slowly morph your architecture over time, breaking away parts until the legacy can be retired or left in “hospice” mode.

So we started very simply – we have a business function where we need to calculate the VAT (Value Added Tax*) rate for any country in the world. In the monolith, this code is buried away in a service class somewhere – it’s a great candidate to be its own microservice:

Simple to describe, but actually deploying this raises some questions:

How to implement the service?

As stated, the “dev” part isn’t too hard – Spring Boot is a perfect fit for Microservices, and we’re very experienced with it here at VirtualPairProgrammers. But really there is infinite choice here, you could for example implement this as an EJB in a Wildfly container. Following the guidance in part 1, this service will have its own data store, and it doesn’t really matter what that is. For a simple service like this, we might even keep the data in memory and simply re-deploy the service when VAT rules change.

Should the VAT Service be deployed to its own Machine (Virtual Machine)?

As mentioned in part 1, we want to be able to maintain total separation of the services, but at the same time we don’t want to incur the cost of running a separate Virtual Machine. This is where containerization comes in.

A container differs subtly from a Virtual Machine. A VM has its own operating system, but a container shares the host’s operating system. This subtle change has major payoffs, mainly that a container is very lightweight, fast and cheap to start up. Whereas a VM might take minutes to provision and boot, a container is up and running in seconds.

A traditional set of Virtual Machines – each VM has its own Operating System…

 

…but containers share the host’s operating system
 

The most popular containerization system (quite over hyped at present) is Docker. This book is an excellent introduction, it’s a practical book and definitely helped us to get started:

How do we call the now-remote service?

 

The usual answer here is to expose a REST interface to the VAT service. This is trivial to do using Boot or JavaEE.

But in this specific example, we are NOT exposing this API to end users – it is only going to be called from our own code. So, it’s actually not at all necessary to use REST. There will be many disagreements here, but you could certainly consider an RPC call! RPC libraries such as Java’s RMI or more generic ones such as gRPC (http://www.grpc.io/) have a bit of a bad name, partly because the binary formats are non-human readable. For service-service APIs, actually RPC is fine – they’re high performance and work well.

(Human readable forms, mainly JSON over HTTPS [aka REST if you’re not Roy Fielding] are the right choice for APIs that are being called by user interfaces, especially JavaScript frameworks).

(Something to think about here, we’ve replaced a very fast local call with what is now essentially a network call. Remember this will be an internal network call – see the stackexchange discussion here.)

 

How does the “client” (the monolith) know where the microservice is?

 

It wouldn’t be a great idea to have code like this:

 

// Call the VAT service
VATRate rate = rest.get("http://23.87.98.32:6379");

I hope that’s obvious – if we change the location of the service (eg change the port it is running on), then the client code will break. So: hardcoding the location of the services is out.

 

So what can we do? This is where Service Discovery via a Service Registry comes in.
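At its simplest, a service registry is just a lookup from a logical service name to its current location. Here is a toy in-memory version to show the shape of the idea. The class, the service name "vat-service" and the addresses are all made up for illustration. Real registries add health checks, replication and much more.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry: logical service name -> current base URL.
final class InMemoryServiceRegistry {
    private final Map<String, String> locations = new ConcurrentHashMap<>();

    // A service calls this when it starts up (or moves).
    void register(String serviceName, String baseUrl) {
        locations.put(serviceName, baseUrl);
    }

    // Clients ask by name instead of hardcoding an address.
    Optional<String> lookup(String serviceName) {
        return Optional.ofNullable(locations.get(serviceName));
    }
}
```

The client code now depends only on the name. If the VAT service moves to a different host or port, it re-registers and the callers pick up the new location on their next lookup.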

There are many choices of Service Registries. Actually Java had a solution for this back in the 1990s, in the shape of the JINI framework. Sadly that was an idea ahead of its time and never caught on (it still exists as Apache River, but I’ve never heard of anyone using it).

 

More popular – Netflix invented one for their Microservice architecture, which is open sourced as Eureka. I haven’t used this, but I understand it is quite tied to AWS and is Java only. Do let us know if you’ve used this at all.

We are using Kubernetes (http://kubernetes.io/) because it provides a service registry (by running a private DNS service), and LOTS more, particularly…

What if the service crashes?

 

It’s no good if the microservice silently falls over and no-one notices for weeks. In our example, it wouldn’t be too bad because the failure of the microservice would lead to a failure of the monolith (we’d see lots of HTTP 500’s or whatever on the main website – but once we’ve scaled up to multiple services, this won’t be the case). This is where orchestration comes in – in brief this is the technique of automatically managing your containers (orchestration is a bigger concept than this, but for our purposes, it is containers that will be orchestrated). The previously mentioned Kubernetes is a complete Orchestration service, originally built by Google to manage (allegedly) 2 billion containers.

 

Kubernetes can automatically monitor a service, and if it fails for any reason, get it back running again. Kubernetes also features load balancing, so if we do somehow manage to scale up to Netflix size, we’d ask Kubernetes to maintain multiple instances of the VAT service container, on separate physical instances, and Kubernetes would balance the incoming load between them.

There aren’t many books available on Kubernetes (yet) – but at the time of writing, the following book is in early release form:

So the overall message is, you’re going to need a lot of tooling, most of it relating to operations rather than development. In Part 3 (probably the final part) I’ll look at another way that services can communicate, leading to a very loosely coupled solution – this will be Event Based Collaboration…

Microservices – Part 1, what are they?

We’re getting a lot of requests lately to produce a Microservices course. It’s been high up on our (growing) todo list so it will definitely happen, but for some reason I’ve held off. I think this might be because we’ve tended to favour courses based on specific tools and concrete techniques rather than high level architectural stuff.
But we’re definitely doing it – possibly as a standalone course, possibly spread across multiple videos (the upcoming courses on Messaging in Spring Boot and Wildfly would both be good candidates to contain some microservices, and our series on DevOps would be a good place to cover how to deploy microservices). Or maybe we’ll do both, a short course addressing the overall ideas and then we’ll apply those ideas in the appropriate courses.
This will happen over the coming months, in the meantime here’s a bit of an unstructured exploration of Microservices.

What are Microservices?

Simply put, the Microservice movement is a shift away from the old-school technique of building huge, single applications that have cross business scope (these are commonly known as Monoliths).
As an honest example, our website at VirtualPairProgrammers has been developed as a traditional monolith – a single WAR file containing, effectively, our entire business.
Ok, we’re not a huge business. We’re not Spotify (yet). It’s just a simple shopping cart site I hear you cry! Well, there’s a lot more going on behind the web interface:

  • Video production (rendering pipelines)
  • Video Subtitling
  • Newsletter Production
  • Customer Lists
  • Sales Data
  • Viewing Figures
  • System Administration
  • Support and Ticketing
  • International Currencies
  • International Business Rules (eg VAT, tax rates)
  • The usual CMS stuff (content management)
  • Standard eCommerce/Shopping Cart
  • Affiliate Management
  • Subscriptions billing and re-billing

Probably more. Our marketing manager loves nothing more than shoving in hideous cartesian join SQLs and exports to Excel, because – that’s what marketing managers do. It annoys the purist developers who think that their Hadoop based Viewing figures code is so beautiful it should be framed and put in an art gallery – but that’s a consequence of running a Monolith – we have different business areas with very different needs treading on one another’s toes.
We’re not ashamed of this monolith. It made absolute sense in our early days to deploy this as a single WAR file – it was the simplest thing that could possibly work, and work it did, for many years. But today, there’s so much complexity in there, it’s becoming more brittle over time. A simple tiny change (like a change in the VAT rate for a country) means a full rebuild (taking around 5 minutes) and then a complete reboot of the Java Server (Tomcat in our case).
A move to a Microservice based architecture would see us deploy multiple, small or tiny applications, each aligned to a specific area of the business.
In Part 2 of this series (next week), I’ll describe our first steps in migrating this monolith to Microservices. In this blog, I’ll talk about the general principles we should be adhering to.
In one sense, there’s nothing exciting in Microservices – it’s just good software engineering principles.

Loose Coupling / High Cohesion

At the core of Microservices is the same principle that is at the heart of any good software design. High cohesion in this context means that a single microservice must do ONE thing, and do that ONE thing well. What “ONE thing” means is vague and is down to judgement, but key to a microservice is that it should be aligned to a specific area of the business. I’ve said that once already, but it’s of utmost importance. It’s absolutely no use making your Microservices align along tiers – if you have a “Web” microservice and a “Middle Tier” microservice and a “Database” microservice, then you don’t have microservices at all – you’ve got three monoliths with an enormous amount of coupling between them. Which brings me to…
…loose coupling, meaning that the dependencies between the services should be as minimal as possible. In development terms, a change to one microservice should have minimal impact on the other microservices in the overall system. In run-time operations terms, ideally, we should be able to take down an entire microservice with no degradation of the performance of the overall system.

Code Repository Isolation

So at one level a microservice is a simple expression of good software engineering, but it leads to difficult architectural choices. If your system is now a hundred microservices, should you have a source code repository that contains all of the microservices (with the services being in subfolders)? Or would you have to maintain a hundred separate git/mercurial repositories?
The answer is separate repositories – if you go with one huge repository, the temptation will be to start doing mega-builds and this will lead to “lock step” deployments where all 100 microservices are deployed at the same time – you therefore have much of the unpleasantness of a monolith. There are in fact many successful Microservice projects that do keep single repositories, so this is something of an “ideal” goal, and it’s probably not a killer – but it makes sense that services which are deployed independently should also be developed independently.
In a similar vein, is it ok to deploy all of your microservices to a single server/VM instance/EC2 Instance/Azure Thingy?

Service Isolation

Ideally a microservice should be deployed onto its own standalone “instance”. Many projects do deploy multiple microservices to a single machine, but again, this can lead to the temptation of coupling them together, leading again to “lock step” deployment.
This can be expensive, which is where container services such as Docker step in. A container can be thought of as much lighter than a Virtual Machine – a single VM can host multiple containers, each container responsible for a single microservice.

We love Docker at VirtualPairProgrammers, and yes, we will be doing a course on it!

Automate, Automate, Automate

You might be able to put up with the pain of manually deploying a monolith. Or manually spinning up a few Cloud Instances to host it. Or manually installing software onto those instances. Some people like pain, especially if there’s a bit of drama involved. Scaling that up to 100 deployments, 1000 instances, forget it. Microservices absolutely depend upon the automation of deployment, of provisioning, of configuration management. Continuous Delivery (http://martinfowler.com/bliki/ContinuousDelivery.html) is a prerequisite.

 

No Integration Databases

This is my favourite principle of Microservices – there should be no Integration Databases – avoid them at all costs. (For a recap of integration vs application databases, you can watch a chapter from our NoSQL course here).
There will be much wailing and gnashing of teeth over this – the integration database is the most precious possession of many businesses. But a database into which anything can delve in, read and write at will, is both incohesive (ie it captures many different parts of the business, by definition) AND it is tightly (not loosely) coupled (again by definition, as many disparate applications DEPEND upon it).
So it’s unarguable really, that integration databases have no part in microservices, it’s part of the definition. However, in the real world, expect to see many projects proudly proclaim that they use microservices, of which one “micro”service is the “database service”. Which you can’t change without the permission of the DBA. Oh look, there’s a “Business Logic” Microservice!
These are just my unstructured thoughts about the main ingredients of a Microservice – in part 2 (next week), I’ll describe a concrete example of how we at VirtualPairProgrammers are slowly migrating our IT across to a Microservice architecture.

 

How (Not) to Design a REST API

On my last project I had to integrate our code with an external REST provider. The provider was a banking service (I’ll call them “TheBank” to protect their identity) and we had to record financial transactions with them.

Check out the API documentation that we had to work with here (note: I’ve completely changed the API terminology so that the actual provider can’t be identified, but the structure and the errors are still the same).

A good Java interview question would be – what have they done wrong in this REST API design? Have a read of the docs before reading on, see if you can come up with a list of what could be improved.

 

(Plug: our JavaEE with Wildfly and Spring Remoting courses both explore good REST API design.)

Ok, you’re back and hopefully you’re face-palming. There’s a quick answer – it’s all wrong. I can’t find a single good decision in that entire API. Good going, TheBank.

Designing APIs is hard, admittedly, but whoever put this together hasn’t even grasped the fundamentals of REST (or HTTP). But this isn’t an isolated case – I’ve lost count of how many times I’ve had to integrate with similarly broken APIs – in fact it would be much quicker to count the ones which *are* well implemented (the figure is not far north of “zero”). This is probably the worst I’ve seen.

I’m not a REST zealot by any means, in fact on our course I openly admit that HATEOAS is a bit of a lofty goal and it’s no disgrace if you don’t go that far. I don’t care about purity or satisfying some aesthetic goal. What I do care about is wasted time and development effort, and I care if I’m forced to write brittle and error prone code.

So let’s run through TheBank’s blunders and see why it matters:

No URIs or Representations

Leaving aside the dodgy looking “endpoint.shtml” (what does this even mean? SHTML was a server side include, some kind of Apache extension. Why do I as the client care about this?), they are routing every single API call in through a single URI. Thus they are immediately losing the expressiveness of URIs. The URIs *are* the API.

So rather than an API, we have a single method with a huge telescoping list of query parameters.

Even though they call their API “RESTful”, there’s no trace of any kind of representation. This means that all the data for every call has to be converted into a long series of query params, leading to the very ugly and unreadable construction.

[There’s nothing wrong with query params – we use them on our REST courses. But only for constraints or extra information that doesn’t belong in the representations. Example – if you only want the first 20 records in a query, then this would make a good query param.]

Why this matters: if done properly, I could have quickly coded up a “Transaction” class in my client and let my framework (I was using Spring) convert it to JSON. Instead I had to spend time string concatenating, always an error prone and tedious process.

Invalid use of GET. No use of HTTP verbs.

GET, by definition, is for “safe” and “idempotent” operations. Meaning, no changes to state and no side effects. The “record” method is of course recording new transactions, so this has violated the contract of GET.

They’re clearly unaware that other verbs exist. POST should have been used for this non-idempotent operation, but the verbs for update and delete (PUT and DELETE) would also have been needed to avoid the ugly use of “method=record”.

Why this matters: I had to be extremely careful to ensure that my get requests are issued once and once only. Every call made to this API looked more or less identical because the very important “method=” is buried in an unreadable list of query params.

Implementing their own authentication scheme

The very weird process of hashing your API Key and Secret is clearly an authentication mechanism that they’ve invented themselves. Why? HTTP has a specified and well understood form of authentication – Basic Authentication. Under the standard, all I would have to do is send an “Authorization” header like this:

Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l

The odd looking string there is the username (API key) and password (secret) separated by a colon and then base64 encoded.
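For the curious, building that header takes a couple of lines with the JDK's own Base64 class. The class and method names below are my own for illustration. The example header above happens to decode to the well-known sample credentials "Aladdin" and "OpenSesame", which makes it easy to check the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuth {
    // Builds the value of the Authorization header for HTTP Basic Authentication.
    static String headerValue(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

In practice most REST clients will do this for you, so you rarely need to write even this much. Remember that Base64 is an encoding and not encryption, which is why it only makes sense over an encrypted connection.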

Instead, they want me to SHA-256 my key and secret before sending in a weird custom header.

I imagine their reasoning here is that SHA-256 is a secure one way hash that can’t be intercepted and reversed. (The Base64 encoding is definitely not secure). This is totally unnecessary – if they’d simply mandated HTTPS, then the traffic would be automatically encrypted, including the username and password. My guess is they’ve had a business directive asking them to not insist on HTTPS, and they’ve tried to fix that by rolling their own (almost certainly broken) encryption scheme.

A fundamental rule of security is that you should never roll your own security scheme, because it *will* be flawed.

Why this matters: I don’t care that the bank might get hacked, but I do care about the wasted day I spent trying to comply with their weird hashing rules. If they’d done it properly, my REST Client would have been able to handle the key and secret through a simple method call.

Bad return codes

This one wins them the jackpot. Every one of their API calls returns “Success!” (HTTP 200), until you check the body string and find out it actually failed. So I’m forced to write client code like this:

ResponseBody response = rest.get("big ugly uri");
if (response.getEntity().equals("Transaction Suceeded"))
{
 // continue
}

YES – they have misspelled “Suceeded” (should be two c’s). So – when they fix this typo my code will instantly break. Thanks, Bank.

Why this matters: I had to spend a long time probing their API to find out the strings they’re returning. I now have brittle string checks which are very likely to break at any time they decide to change those strings. And they will.
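Had the API used proper return codes, the client could branch on the numeric status instead of comparing body strings. Here is a rough sketch of what that check might look like. The class name and the wording of the outcomes are mine, not anything from TheBank's documentation.

```java
final class ResponseCheck {
    // 2xx means the request succeeded; no need to inspect the body text.
    static boolean isSuccess(int httpStatus) {
        return httpStatus >= 200 && httpStatus <= 299;
    }

    // Decide what to do next purely from the status code class.
    static String outcome(int httpStatus) {
        if (isSuccess(httpStatus)) {
            return "continue";
        }
        if (httpStatus >= 400 && httpStatus <= 499) {
            return "client error: fix the request";
        }
        if (httpStatus >= 500 && httpStatus <= 599) {
            return "server error: retry later";
        }
        return "unexpected status " + httpStatus;
    }
}
```

Status codes are part of the HTTP standard, so this check would keep working whenever the provider corrects a typo in their message strings.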

In many REST textbooks, they get themselves excited about HATEOAS. Forget that, the basics of URIs, Representations, HTTP Verbs, HTTP Return Codes and Security are all fundamental. Not many get all of these right, and a huge number get them ALL wrong. Don’t be like the bank.

On our Spring REST course, we set a programming challenge, part of which is to design a REST API – you can see that video here – it’s a bit long but there are some interesting decisions to make. Subscribe and you get the full course!

How to attach a Debugger to a running Tomcat or Glassfish instance

This is a frequently asked question from many of our customers at Virtual Pair Programmers, so I thought a blog post to capture the details would be in order. I’ll focus on Tomcat and Glassfish here, because we use them on our courses – but the details are the same for other servers.

  • (for Glassfish) 1) Run the server under debug. The easiest way is to run as normal, then go to the admin console. Go to Configurations -> server-config (not default-config) -> JVM Settings. Click Debug Enabled. These options will be different on different Glassfish versions (I’m on 4), but you should be able to find them. Check the port that the debugger will run on – it will be part of the debug options and usually the default is 9009.

    (for Tomcat) 1) Add a JVM option called “agentlib” to your startup script. On our courses, we use a bootstrap script called startup.bat, and you can edit it to look like this:

    cd ./tomcat/bin/
    java -Dsun.lang.ClassLoader.allowArraySyntax=true -agentlib:jdwp=transport=dt_socket,address=9009,server=y,suspend=n -jar bootstrap.jar
    

    (note: we use a simple bat file for bootstrapping Tomcat on our courses to simplify support: if you’re not using this script, then you need to put the JVM options in a new file bin/setenv.bat. Full details can be found here: )
  • 2) Restart the server (in Glassfish, a link may appear at the top of the page you can click. Otherwise, run the stopserv script and then startserv)
  • 3) Remember to add breakpoints in your code where required *AND re-deploy*. I sometimes forget to do this and wonder why I don’t hit any breakpoints.
  • 4) Now you can attach Eclipse to the debugger:

    a) Debug icon -> Debug Configurations
    b) Click “Remote Java Application”
    c) click the tiny icon at the top left – it is for “new session”
    d) Enter the correct port number you noted earlier (we suggested 9009)
    e) click the debug button.

  • 5) I find this odd: you won’t see anything special at this stage, because you have attached to the running server *in the background*. There won’t be a console window and Eclipse won’t switch to the debug perspective.
  • 6) You now need to hit a breakpoint, so to do this, exercise your code. This may be visiting one of your webpages, or running a test harness.
  • 7) When your client code causes a breakpoint to trigger in the server code, your run should be interrupted with a request to switch to the debug perspective, and you can now step through the code as usual.
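
Because nothing visible happens when you attach, it can be handy to confirm that the server JVM really was started with the debug agent. The following is only a sketch, and it assumes you can run a few lines of code inside the server JVM (for example from a servlet or a startup listener); DebugAgentCheck is a made-up name:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class DebugAgentCheck {

    // True if any JVM argument switches on the JDWP debug agent, either in the
    // -agentlib:jdwp form used above or in the older -Xrunjdwp form.
    static boolean hasDebugAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The arguments the current JVM was started with (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Debug agent enabled: " + hasDebugAgent(jvmArgs));
    }
}
```

If this reports false, the server was not started under debug and Eclipse will have nothing to attach to.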

I hope that’s useful!

Writing a Custom HTTP Message Converter in Spring

The Spring Webservices course got so big that we had to cut a few minor topics, and I promised on the video that I would write some blog posts covering them. Here’s the first of them, how to write a “Custom Message Converter”.

You probably don’t need to do this very often – I’ve never had to do this “in real life”. But it is a useful exercise to get a better understanding of what those message converters are doing.

Recall that in Spring, an HttpMessageConverter is a class that is capable of converting a regular Java domain object to a REST representation (and back again). Spring has a small set of default converters already built in, but the two main ones are for JSON (the most common representation used in REST) and XML.

For this exercise, let’s assume that for some reason, our REST application needs to support YAML as well. YAML originally stood for Yet Another Markup Language (literally), although these days the name is officially “YAML Ain’t Markup Language”. It aims to be simpler than XML, and it’s used a lot in Rails.

As a starting point, I’ve fired up the REST project that we built on the training course. I’ve also started up the standard Spring REST shell:

baseUri mywebapp
get /customers

< 200 OK
< Server: Apache-Coyote/1.1
< Content-Type: application/json;charset=UTF-8
< Transfer-Encoding: chunked
< Date: Thu, 12 Feb 2015 17:56:33 GMT
<
{
  "customers" : [ {
    "customerId" : "100029",
    "companyName" : "Acme",
    "email" : null,
    "telephone" : null,
    "notes" : "No Notes",
    "calls" : null,
    "version" : 1,
    "links" : [ {
      "rel" : "self",
      "href" : "http://localhost:8080/mywebapp/customers/customer/100029?fullDet

As on the course, if the client wants XML instead, they can change the accept headers:

headers set --name accept --value application/xml

And now we repeat the get request….

get /customers
> accept: application/xml

< 200 OK
< Server: Apache-Coyote/1.1
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Thu, 12 Feb 2015 18:04:06 GMT
<
<<customers><customer><companyName>Acme</companyName
><customerId>100029</customerId> ... lots of XML snipped

But there is no YAML message converter installed by default in Spring….

headers set --name accept --value application/yaml
get customers

> accept: application/yaml

<406 NOT_ACCEPTABLE

So let’s write a YAML Message Converter!

Step 1: Add the JAR file for YAML

One Java YAML parser is called SnakeYAML (code.google.com/p/snakeyaml). You can download the JAR from there, but if you have done our course, I actually supplied this JAR in the “Additional JARs” folder. So pull it from there and add it to your build path.

This library is very easy to use. If you want to try it out, you can easily convert an object into YAML (and back again) in a test harness.

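import org.yaml.snakeyaml.Yaml;

import com.virtualpairprogrammers.domain.Customer;

public class TestYaml
{
 public static void main(String[] args)
 {
  Customer c = new Customer("10012", "Acme","Notes");

  // dump() serialises the object graph into a YAML String
  Yaml yaml = new Yaml();
  System.out.println(yaml.dump(c));
 }
}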

This gives an output like this:

!!com.virtualpairprogrammers.domain.Customer
calls: []
companyName: Acme
customerId: '10012'
email: null
notes: Notes
telephone: null
version: 0

Step 2: Write the converter

This is the bulk of the work. To write a message converter, extend the Spring AbstractHttpMessageConverter, and override the three methods as below.

  • readInternal() describes how Spring should convert the data (YAML) into a Java object.
  • writeInternal() is the opposite – it generates a YAML String from a Java object (this will be done in a similar way to our test above).
  • The supports() method is used to determine whether the converter actually supports conversion to and from the type of object in question. You might decide that you’re not going to support collections for example. We’ll simply return true and support any object.

In the constructor, we call the superclass constructor, which requires a MediaType object to denote what the HTTP media type is. We’re supporting application/yaml.

The implementations of the read and write methods are fairly routine; we’re just using the SnakeYAML library. It takes a bit of fiddling with the API of the HttpInputMessage and HttpOutputMessage classes to get what you need. In readInternal(), the getBody() method returns a standard Java InputStream, which luckily SnakeYAML can accept. In writeInternal(), we have to convert the YAML String into a byte array so we can pass it to the write() method of the OutputStream returned by the HttpOutputMessage’s getBody(). It’s all a bit fiddly but straightforward in the end.


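import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.springframework.http.HttpInputMessage;
import org.springframework.http.HttpOutputMessage;
import org.springframework.http.MediaType;
import org.springframework.http.converter.AbstractHttpMessageConverter;
import org.springframework.http.converter.HttpMessageNotReadableException;
import org.springframework.http.converter.HttpMessageNotWritableException;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.constructor.Constructor;

public class YamlMessageConverter<T> extends AbstractHttpMessageConverter<T>
{
 public YamlMessageConverter()
 {
  // The media type this converter reads and writes: application/yaml
  super(new MediaType("application","yaml"));
 }

 // Request body (YAML) -> Java object
 @Override
 @SuppressWarnings("unchecked")
 protected T readInternal(Class<? extends T> clazz, HttpInputMessage inputMessage)
   throws IOException, HttpMessageNotReadableException
 {
  // Tell SnakeYAML which root type to build (SnakeYAML 1.x constructor)
  Yaml yaml = new Yaml(new Constructor(clazz));
  return (T) yaml.load(inputMessage.getBody());
 }

 // We support any type of object
 @Override
 protected boolean supports(Class<?> clazz) {
  return true;
 }

 // Java object -> response body (YAML)
 @Override
 protected void writeInternal(T object, HttpOutputMessage outputMessage)
   throws IOException, HttpMessageNotWritableException
 {
  Yaml yaml = new Yaml();
  String result = yaml.dump(object);
  // Use an explicit charset rather than the platform default
  outputMessage.getBody().write(result.getBytes(StandardCharsets.UTF_8));
 }
}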
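
If you did decide not to support collections, the check inside supports() could look something like the sketch below. It is written as a standalone helper so it can be tried without Spring; SupportedTypes and isSupported are names I have made up for illustration, and excluding arrays and maps as well is my own assumption about what "collections" should cover.

```java
import java.util.Collection;
import java.util.Map;

final class SupportedTypes {

    // Rejects arrays, Collections and Maps; accepts every other type.
    static boolean isSupported(Class<?> clazz) {
        return !clazz.isArray()
                && !Collection.class.isAssignableFrom(clazz)
                && !Map.class.isAssignableFrom(clazz);
    }

    public static void main(String[] args) {
        System.out.println(isSupported(String.class));              // true
        System.out.println(isSupported(java.util.ArrayList.class)); // false
        System.out.println(isSupported(String[].class));            // false
    }
}
```

In the converter, supports() would then simply return the result of this check instead of true.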

Step 3: Register the converter

The magic that makes the default message converters automatically happen is the <mvc:annotation-driven> tag in your Spring configuration.

We can add our new YAML Converter into this tag:

<!-- This will automatically switch on the default httpmessageconverters -->
 <mvc:annotation-driven content-negotiation-manager="contentNegotiationManager">
  <mvc:message-converters register-defaults="true">
   <bean class="com.virtualpairprogrammers.messageconverters.YamlMessageConverter"/>
  </mvc:message-converters>
 </mvc:annotation-driven>

Note: the “register-defaults=true” is needed – without it, the default converters will not be registered and you will end up with only the YAML one.

And that’s it. We can now deploy the application and test:

headers set --name accept --value application/yaml
get customer/100029

< 200 OK
< Server: Apache-Coyote/1.1
< Content-Type: application/yaml
< Transfer-Encoding: chunked
< Date: Fri, 13 Feb 2015 12:55:47 GMT
<
!!com.virtualpairprogrammers.domain.Customer
calls: []
companyName: Acme
customerId: '100030'
email: null
notes: No Notes
telephone: null
version: 1

Our representation is now in YAML.

I hope this exercise may prove useful to someone – to be honest I’m not really interested in YAML, the main point of the exercise is to get an understanding of what those mysterious HttpMessageConverters are doing!

Minor bug in our Webservices course

I’ve discovered a minor fault in our Webservices course. We supply a JSON file containing a data graph – and there’s a missing curly bracket! This is important because without it, any attempt to record a call via the REST Shell will fail with a JSON properties exception.

The file should look like this:

{"call":
 {"notes":"Customer called to complain about late delivery.",
         "timeAndDate":"2014-12-05T04:00:00Z"},

 "actions":[{"details":"Return call.",
             "requiredBy":"2016-12-09",
             "owningUser":"rac",
             "complete":false},
            {"details":"Check handled ok",
             "requiredBy":"2016-12-25",
             "owningUser":"rac",
             "complete":false}
           ]
}

The missing curly bracket is added to the end of the line with the timeAndDate.
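
If you want a quick sanity check on a hand-edited JSON file, a bracket counter is enough to catch this kind of slip. This is only a rough sketch and not a JSON parser: it checks that curly and square brackets outside of string values are properly nested, and nothing else. BracketCheck is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BracketCheck {

    // True if every { and [ outside a string literal has a matching, correctly nested close.
    static boolean isBalanced(String json) {
        Deque<Character> open = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                open.push(c);
            } else if (c == '}' || c == ']') {
                if (open.isEmpty()) {
                    return false; // a close with no matching open
                }
                char expected = (c == '}') ? '{' : '[';
                if (open.pop() != expected) {
                    return false; // wrong kind of bracket closed
                }
            }
        }
        return open.isEmpty() && !inString;
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("{\"call\":{\"notes\":\"x\"},\"actions\":[]}")); // true
        System.out.println(isBalanced("{\"call\":{\"notes\":\"x\",\"actions\":[]}"));  // false
    }
}
```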

Many apologies for the error, I hope it hasn’t caused too many problems.

Decent settings for DBCP Connection Pools

The Spring Framework course from 2009 is the first course that we’ve re-recorded at VirtualPairProgrammers. Although surprisingly little has changed in Spring since then, we felt it was time to polish the course up a little, to use the latest Spring 4, and in particular to use a more modern format for the video – with the second edition you’ll be able to view it on iPads and mobile devices, as with most of our other courses.

Note: everyone who bought the first edition of the course will automatically receive the second edition on the day of release – currently slated for around the 14 March 2014, but there may be delays as we complete the editing process.


I have actually made very few changes from the original. One area that I felt worthy of update was in our choice of connection pool. In the first edition we used the Apache DBCP connection pool, largely because it was the pool of choice at that time for the reference manual.

Since then, it’s fair to say that DBCP has come in for a lot of criticism, and other pools such as c3p0, Proxool or the Tomcat pool have become more popular.

There’s a great debate about this at StackOverflow (see here: https://stackoverflow.com/questions/520585 – a shame they closed the question as “not constructive” because it most certainly was constructive).

In the end, however, I decided to continue using DBCP for the second edition, partly to keep consistency with the old course, but also because actually DBCP isn’t that bad – we’ve used it successfully on several large scale projects with high traffic.

I think the biggest problem with DBCP is that the defaults are so poor. If you configure DBCP with just a driver, url, user and pass, then you’re going to end up with a pool that soon locks up.

On the re-recorded version I alert the viewers to this, and tell you that you really need to tweak the pool to bring it to a performant level. But there isn’t time on the course to get bogged down in this, so I pointed the viewers to this blog post, where some more sensible values can be found.

Our default settings are:

  • maxActive = 150
  • maxIdle = 10
  • minIdle = 5
  • initialSize = 5
  • minEvictableIdleTimeMillis = 1800000
  • timeBetweenEvictionRunsMillis = 1800000
  • maxWait = 10000
  • validationQuery = “SELECT 1”
  • testOnBorrow=true
  • testOnReturn=true
  • testWhileIdle=true

And you set each of these properties in the Spring XML in the same way you set the driver etc., e.g. <property name="maxActive" value="150"/>
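
If you would rather keep the settings in code than in XML, the same values can be collected in a java.util.Properties object. This is just a sketch using the JDK: PoolSettings is a made-up class, and the assumption is that you would add your own driver, url, username and password and then hand the Properties to DBCP (version 1.x has BasicDataSourceFactory.createDataSource(Properties) for this, but check it against the version you are using).

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;

final class PoolSettings {

    // The starting-point values listed above, keyed by the DBCP property names.
    static Properties defaults() {
        // Both eviction settings are 30 minutes expressed in milliseconds (1800000).
        String thirtyMinutes = Long.toString(TimeUnit.MINUTES.toMillis(30));

        Properties p = new Properties();
        p.setProperty("maxActive", "150");
        p.setProperty("maxIdle", "10");
        p.setProperty("minIdle", "5");
        p.setProperty("initialSize", "5");
        p.setProperty("minEvictableIdleTimeMillis", thirtyMinutes);
        p.setProperty("timeBetweenEvictionRunsMillis", thirtyMinutes);
        p.setProperty("maxWait", "10000"); // milliseconds, so 10 seconds
        p.setProperty("validationQuery", "SELECT 1");
        p.setProperty("testOnBorrow", "true");
        p.setProperty("testOnReturn", "true");
        p.setProperty("testWhileIdle", "true");
        return p;
    }

    public static void main(String[] args) {
        defaults().list(System.out);
    }
}
```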

I’m not saying these values are good for any application – you need to test, tweak and tune, but at VPP we use these settings as a starting point, and they are in fact the exact settings we currently have on our live site. Our live site isn’t exactly high traffic in the Facebook/Google sense, but we do get heavy traffic when we release a new course, so these settings should be reasonably good for most average websites.

Having said that, you can also switch to other pools quite easily, but I wanted to capture these defaults somewhere.