My colleague, Richard Chesterwood, has just posted on his blog about a problem with Spring Boot 1.4... if you're getting an issue with unsatisfied dependencies, check it out:
https://richardchesterwood.blogspot.co.uk/2016/10/spring-boot-crashing-due-to-unsatisfied.html
Monday, 17 October 2016
Tuesday, 23 August 2016
Tomcat problems with Java 8
If you're doing any of our courses that use Tomcat, be aware that the latest update to Java 8 (1.8.0_91) seems to have broken JSP compilations for all versions of Tomcat up to and including 8. We're not sure why this is happening, but as a quick fix either use Java 1.8.0_77 or earlier, OR use Tomcat 9 which is confirmed to fully support Java 8.
(Note that Tomcat 9 is still in Alpha, so doing this carries some risk - the safest choice is to use an earlier Java).
Thanks to all those who have reported this. You can also follow a Stack Overflow post at http://stackoverflow.com/…/spring-mvc-unable-to-compile-cla…
Thursday, 26 March 2015
Java Advanced Course now live!
Today's an exciting day - we've just put the Java : Advanced Topics course live on the Virtual Pair Programmers' website.
I'm really pleased with this course, and I think it is going to be really helpful to lots of Java developers. It covers topics that most Java courses skip because they are that bit more advanced, but that are vital for really good Java developers to know about.
For example, we go into depth on how the LinkedHashMap actually works, what can go wrong when you're writing multi-threaded apps, and how to avoid it, and even how to load-test your application so that you can be sure it won't run out of memory when you put it onto the production server!
I hope you enjoy it!
Monday, 9 February 2015
An update on Hadoop Versions
Our popular Hadoop for Java Developers course was recorded using version 2.4.0 of Hadoop. Since the course was released there have been some further releases of Hadoop, with the current version being 2.6.0.
There are no differences in the content that we cover on the course between the two versions of Hadoop, so the course is completely valid if you wish to use 2.6.0 or 2.4.0. In this blog post, however, I want to point out a reason to stick with version 2.4.0, and a couple of pointers that you should be aware of if you are going to use 2.6.0. I'll also mention the process to upgrade from 2.4.0 to 2.6.0.
Which Version of Hadoop should I use?
If you're starting to develop with Hadoop today, you might just want to download the latest version (2.6.0) from the Hadoop website. There is only really one reason that I can think of not to do this: Amazon's Elastic MapReduce (EMR) service, which can be used to run Hadoop jobs "in the cloud", is not yet compatible with versions of Hadoop newer than 2.4.0.
Although the code that you'll write on the course is identical in both versions of Hadoop, if you compile your code with the 2.6.0 jar files you'll not be able to run it on EMR. For this reason we suggest you consider sticking with 2.4.0, at least while learning Hadoop, so that you can experience EMR (we cover how to set up and run an EMR job on the course). If you plan to use Hadoop on EMR in a production scenario then you must stick to 2.4.0 until Amazon update the EMR service to work with newer versions.
You can download a copy of version 2.4.0 from this link.
If I am going to use 2.6.0, what do I need to know?
The only things to be aware of if you wish to study the course with version 2.6.0 of Hadoop are:
(1) Your standard installation path will be /opt/hadoop-2.6.0/ instead of /opt/hadoop-2.4.0/ so you'll want to change the references to that path in the following two script files that are provided with the course:
startHadoopPseudo
startHadoopStandalone
(2) When you install Hadoop, you'll edit either .bashrc or .profile - make sure you put the reference to the correct folder name in there too. You'll also be creating symbolic links to the Hadoop configurations - again, make sure you use the correct folder names when you set these up.
What happens if I want to upgrade from 2.4.0 to 2.6.0?
If you have been running with 2.4.0 and wish to upgrade to 2.6.0, you just need to do the following:
(1) Download and unpack the 2.6.0 files from the Hadoop website - place these in /opt/hadoop-2.6.0/
(2) Create the configuration folders under /opt/hadoop-2.6.0/etc as you did for Hadoop 2.4.0 (you can actually copy the configuration folders from your 2.4.0 installation as they'll be valid for 2.6.0)
(3) Edit your .bashrc (Linux) or .bash_profile (Mac) to change the location of the Hadoop files in the HADOOP_PREFIX and PATH variables from 2.4.0 to 2.6.0
(4) Close your terminal window and open a new one to ensure that the updated environment variables and PATH variable are loaded.
(5) Run the script resetHDFS - you must be in the Scripts directory to run this script. It will reformat the HDFS file system and create the symbolic links needed to use the Pseudo configuration. After running this script, run the jps command and check that you have the various daemons running (namenode, datanode etc)
(6) Your code, compiled with 2.4.0, will work in 2.6.0 - if you wish to recompile with 2.6.0, remove all the Hadoop jar files from the build path, and then re-add them from the folders under /opt/hadoop-2.6.0/share/hadoop
Monday, 1 December 2014
Java 8 Time - choosing the right object
In the last blog post, we looked at the java.time library’s Instant and Duration objects in Java 8. In this second post, we’ll get an overview of some of the other objects within the java.time libraries.
The Instant object is defined in the JavaDocs as “An instantaneous point on the time-line.” There are other objects, related to the Instant, that might also be useful to us – in particular LocalDateTime, LocalDate and LocalTime and ZonedDateTime.
I’ll ignore LocalDate and LocalTime for the moment, and consider LocalDateTime, ZonedDateTime and Instant... all three of these appear to be quite similar, so it’s worth understanding how each differs.
What's the time?
A good starting place is to instantiate each object type with its now() method, to get the current time, and to print these out to the console to see what the print formats look like. The code to do this is shown here, with the output below:

    import java.time.Instant;
    import java.time.LocalDateTime;
    import java.time.ZonedDateTime;

    // Capture "now" three ways, then print each to compare the formats
    Instant instNow = Instant.now();
    LocalDateTime ldtNow = LocalDateTime.now();
    ZonedDateTime zdtNow = ZonedDateTime.now();
    System.out.println(instNow);
    System.out.println(ldtNow);
    System.out.println(zdtNow);

2014-12-01T15:18:31.094Z
2014-12-01T15:18:31.109
2014-12-01T15:18:31.152Z[Europe/London]
Ignoring the few milliseconds' difference between the three calls, the formats are interesting. The first, the Instant, tells us that this code was run at 3.18pm on 1st December 2014. The letter Z at the end stands for Zulu time, which is also known as GMT (Greenwich Mean Time) or UTC (which stands for Coordinated Universal Time… I don’t know why it’s not abbreviated to CUT – I’ll leave you to go searching on Wikipedia if you want to know more about this!) So the first item, the Instant, is 3.18pm on 1st December 2014 UTC.
The second result is the local date time – that’s a representation of my current time, which is 3.18pm on 1st December 2014, that's what my computer clock and calendar say. And the third item is the zoned date time, where we can again see that it’s 3.18pm on 1st December 2014, London time.
This all looks very neat because I’m in the UK, where the timezone is GMT (at least in Winter)… what would have happened if I was somewhere else in the world?
Well here’s what a person in Abu Dhabi who tried the same exercise would find...
2014-12-01T15:18:31.094Z
2014-12-01T19:18:31.109
2014-12-01T19:18:31.152+04:00[Asia/Muscat]
So now we can see that the Instant is the same point on the timescale as the person in London found, but their LocalDateTime was the time their clock shows (4 hours later than the person in London’s clock) and the ZonedDateTime matches the LocalDateTime but it has the time-zone embedded in.
Let’s dig a little deeper into the definitions of these objects…
An Instant is an actual point in time, expressed using UTC – a universal time scale. A ZonedDateTime is an actual point in time, but in a specific area of the world... so if you know the timezone you can convert reliably from an Instant to a ZonedDateTime. A LocalDateTime is a representation of the concept of a date and time, but doesn’t actually correspond to an actual point in time… or rather it can only be converted into an actual point in time if we ADD IN a timezone.
Although LocalDateTime objects don’t necessarily correspond to an actual point in time, you can still do all the “date mathematics” you might want to with them, such as adding or subtracting days, minutes or nanoseconds.
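To make the relationship between the three types concrete, here is a minimal sketch of converting between them. The class and method names (TimeConversions, inZone, toInstant) are my own, chosen for illustration; the java.time calls themselves (atZone, toInstant, ZoneId.of) are standard.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class TimeConversions {

    // An Instant plus a zone gives a ZonedDateTime: the same point on the
    // time-line, viewed from a particular place.
    public static ZonedDateTime inZone(Instant instant, String zoneName) {
        return instant.atZone(ZoneId.of(zoneName));
    }

    // A LocalDateTime only becomes a point on the time-line once we add a zone.
    public static Instant toInstant(LocalDateTime local, String zoneName) {
        return local.atZone(ZoneId.of(zoneName)).toInstant();
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2014-12-01T15:18:31.094Z");
        System.out.println(inZone(instant, "Europe/London"));
        System.out.println(inZone(instant, "Asia/Muscat"));

        // The same wall-clock reading is a different Instant in each zone.
        LocalDateTime wallClock = LocalDateTime.of(2014, 12, 1, 19, 18, 31);
        System.out.println(toInstant(wallClock, "Asia/Muscat"));   // 2014-12-01T15:18:31Z
        System.out.println(toInstant(wallClock, "Europe/London")); // 2014-12-01T19:18:31Z
    }
}
```

The last two lines are the point: a LocalDateTime of 19:18:31 means nothing on the time-line until we say whose clock it was read from.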
Ok so how is this useful?
If you’re writing code that is going to manipulate dates, you now have a choice of what to use. In the United Kingdom, our clocks go forward by an hour in the summer (known as daylight saving), so at some times in the year we are effectively operating at GMT and at others it’s GMT+1.
This could cause confusion on the dates the clocks go back an hour from GMT+1 to GMT. The change happens at 2am – this year it was on Sunday 26th October. What this means is that if I looked at my digital clock, the minutes on Sunday 26th October changed like this…
01:58
01:59
01:00
01:01
As a result, the time 01:30 happened twice on 26th October… at least as far as my clock is concerned. But my clock operates on LocalDateTime (or it would if it were running Java) – if I had an Instant clock it would have looked like this:
00:58
00:59
01:00
01:01
And if I had a ZonedDateTime clock it might have looked like this:
01:58 BST
01:59 BST
01:00 GMT
01:01 GMT
(BST stands for British Summer Time – it’s another way of saying GMT+1)
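The repeated 01:30 can be reproduced in code. This is a small sketch, assuming the Europe/London zone; the class and method names are hypothetical. It relies on ZonedDateTime's withEarlierOffsetAtOverlap() and withLaterOffsetAtOverlap() methods to pick each of the two moments that share the same wall-clock reading.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class ClockChangeOverlap {

    static final ZoneId LONDON = ZoneId.of("Europe/London");

    // The first time the wall clock shows this reading (still on BST).
    public static ZonedDateTime firstOccurrence(LocalDateTime local) {
        return ZonedDateTime.of(local, LONDON).withEarlierOffsetAtOverlap();
    }

    // The second time the wall clock shows this reading (back on GMT).
    public static ZonedDateTime secondOccurrence(LocalDateTime local) {
        return ZonedDateTime.of(local, LONDON).withLaterOffsetAtOverlap();
    }

    public static void main(String[] args) {
        LocalDateTime halfPastOne = LocalDateTime.of(2014, 10, 26, 1, 30);
        ZonedDateTime first = firstOccurrence(halfPastOne);
        ZonedDateTime second = secondOccurrence(halfPastOne);

        System.out.println(first.toInstant());   // 2014-10-26T00:30:00Z
        System.out.println(second.toInstant());  // 2014-10-26T01:30:00Z
        System.out.println(Duration.between(first, second)); // PT1H
    }
}
```

One LocalDateTime, two different Instants an hour apart: that is exactly the ambiguity the clock listings above illustrate.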
Now let’s suppose we were writing code that was going to run a critical piece of functionality each night at 1.30am. For example, maybe we are a bank, and it’s at 1.30am that our function starts that is going to calculate the daily interest on every customer's account. It would be important not to use a LocalDateTime object for this code, as it might run twice on 26th October (and not at all on 30th March 2014, the date that the clocks skipped an hour). In this example, I’d want to use an Instant - I want my critical process to run at fixed times on the time-line.
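If you want to check how many times a given wall-clock reading actually occurs on a particular date, one option is to ask the zone's rules. The sketch below is illustrative only (the ValidOffsets class and occurrences method are names I've made up); it uses ZoneRules.getValidOffsets, which returns no offsets for a skipped time, one for a normal time, and two for a repeated time.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

public class ValidOffsets {

    // How many real moments correspond to this wall-clock reading in this zone?
    public static int occurrences(LocalDateTime local, ZoneId zone) {
        return zone.getRules().getValidOffsets(local).size();
    }

    public static void main(String[] args) {
        ZoneId london = ZoneId.of("Europe/London");

        // Clocks went back: 01:30 happened twice.
        System.out.println(occurrences(LocalDateTime.of(2014, 10, 26, 1, 30), london)); // 2
        // Clocks went forward: 01:30 never happened.
        System.out.println(occurrences(LocalDateTime.of(2014, 3, 30, 1, 30), london));  // 0
        // An ordinary night.
        System.out.println(occurrences(LocalDateTime.of(2014, 12, 1, 1, 30), london));  // 1
    }
}
```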
In fact, if I was coding up a system for a bank that was recording every transaction in every account, I’d probably want to store the date and time for each transaction as an Instant object… that way I know exactly when it happened. And because bank transactions can occur between countries, it makes sense to know the date and time of each transaction based on UTC, and then translate that into a local date and time for each country’s reports.
So if we go on to write another piece of code that details, for example, the number of transactions by hour while the bank was open each day, for this we would want to use the ZonedDateTime object, to query transactions between fixed times in each location… this can tell us how many transactions occurred between 9am and 10am LOCAL time (the first hour of opening) in each country.
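As a rough illustration of that idea, here is a sketch that stores transactions as Instants and counts how many fall in a given local hour for a chosen zone. The LocalHourReport class, the countInLocalHour method and the sample data are all invented for this example; a real system would of course query a database rather than a list.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.util.Arrays;
import java.util.List;

public class LocalHourReport {

    // Count the transactions whose local time in the given zone falls in the given hour (0-23).
    public static long countInLocalHour(List<Instant> transactions, ZoneId zone, int hour) {
        return transactions.stream()
                .filter(t -> t.atZone(zone).getHour() == hour)
                .count();
    }

    public static void main(String[] args) {
        // Made-up transaction times, all stored in UTC.
        List<Instant> transactions = Arrays.asList(
                Instant.parse("2014-12-01T09:15:00Z"),
                Instant.parse("2014-12-01T05:30:00Z"),
                Instant.parse("2014-12-01T09:45:00Z"));

        // Between 9am and 10am London time: the 09:15Z and 09:45Z transactions.
        System.out.println(countInLocalHour(transactions, ZoneId.of("Europe/London"), 9)); // 2
        // Between 9am and 10am Muscat time (UTC+4): only the 05:30Z transaction.
        System.out.println(countInLocalHour(transactions, ZoneId.of("Asia/Muscat"), 9));   // 1
    }
}
```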
LocalDate and LocalTime
As you might suspect from their names, these objects store a date or a time in isolation… and in fact a LocalDateTime is actually a combination of LocalDate and LocalTime… and the LocalDateTime object has toLocalDate() and toLocalTime() methods to easily extract each component.
Now I think these are particularly useful – well, LocalDate is. I often find I want to compare 2 dates ignoring their time values. I’ve traditionally used the Apache Commons DateUtils library to help with this – it has a truncatedCompareTo method which allows you to compare 2 dates but only consider (for example) the date part, or only consider the year and month.
Now you might think that you could do this with the Instant objects... consider this code for example:

    import java.time.Duration;
    import java.time.Instant;

    Instant instant1 = Instant.parse("2014-12-01T16:01:13.419Z");
    Instant instant2 = Instant.parse("2014-12-17T15:17:22.305Z");
    // toDays() counts whole 24-hour blocks only
    System.out.println(Duration.between(instant1, instant2).toDays());

Here we have two dates - the 1st December at 4.01pm, and the 17th December at 3.17pm. I'd like to know how many days there are between these two dates, ignoring the times. The answer should be 17 less 1 = 16.
However the output from my println is... 15. The reason is that there are actually 15 days, 23 hours and 16 minutes between these two dates and our .toDays() method tells us that's 15 days - it ignores the hours and minutes.
So how do we find out the real number of days, ignoring the time?
Well here's some code that will do this:
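
    import java.time.Instant;
    import java.time.LocalDate;
    import java.time.LocalDateTime;
    import java.time.Period;
    import java.time.ZoneId;

    Instant instant1 = Instant.parse("2014-12-01T16:01:13.419Z");
    Instant instant2 = Instant.parse("2014-12-17T15:17:22.305Z");
    // Drop the time part by converting each Instant to a LocalDate
    LocalDate d1 = LocalDateTime.ofInstant(instant1, ZoneId.systemDefault()).toLocalDate();
    LocalDate d2 = LocalDateTime.ofInstant(instant2, ZoneId.systemDefault()).toLocalDate();
    System.out.println(Period.between(d1, d2).getDays());
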
In this code, we're taking 2 different Instant objects, converting them into LocalDate objects (note that we need to first convert them to LocalDateTimes... and I've just used the system default time zone for the conversion), and then looking at how many days there are between the two local dates. We get the answer we're looking for - 16. One caveat: getDays() returns only the days part of the Period, so this works here because the two dates are less than a month apart.
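For dates that are further apart, a possible alternative is ChronoUnit.DAYS.between, which returns the total number of days rather than just the days component. The sketch below compares the two approaches; the DaysBetween class and its method names are mine, and I've used UTC rather than the system default zone so that the result doesn't depend on where the code runs.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.Period;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class DaysBetween {

    // Convert an Instant to the calendar date it falls on in UTC.
    public static LocalDate utcDate(Instant instant) {
        return LocalDateTime.ofInstant(instant, ZoneOffset.UTC).toLocalDate();
    }

    // Only the "days" part of the period (a period of 1 month 16 days gives 16).
    public static int periodDays(LocalDate from, LocalDate to) {
        return Period.between(from, to).getDays();
    }

    // The total number of calendar days between the two dates.
    public static long totalDays(LocalDate from, LocalDate to) {
        return ChronoUnit.DAYS.between(from, to);
    }

    public static void main(String[] args) {
        LocalDate d1 = utcDate(Instant.parse("2014-12-01T16:01:13.419Z"));
        LocalDate d2 = utcDate(Instant.parse("2014-12-17T15:17:22.305Z"));
        System.out.println(periodDays(d1, d2)); // 16
        System.out.println(totalDays(d1, d2));  // 16

        // More than a month apart, the two approaches disagree.
        LocalDate d3 = LocalDate.of(2015, 1, 17);
        System.out.println(periodDays(d1, d3)); // 16 (the period is 1 month and 16 days)
        System.out.println(totalDays(d1, d3));  // 47
    }
}
```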
But this introduces another new object - that of the Period. When we compared 2 Instants in the previous blog post, we used the Duration object - well if you try and compare 2 LocalDates with a Duration object you'll find it will compile but you'll get a rather horrible looking UnsupportedTemporalTypeException.
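Here is a small sketch of that failure, along with one way around it if you really do want a Duration: convert each LocalDate to a LocalDateTime at midnight with atStartOfDay(). The class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.temporal.UnsupportedTemporalTypeException;

public class DurationOnDates {

    // Returns true if Duration.between rejects two LocalDates at runtime.
    public static boolean durationBetweenDatesFails(LocalDate from, LocalDate to) {
        try {
            Duration.between(from, to); // compiles, but LocalDate has no time units
            return false;
        } catch (UnsupportedTemporalTypeException e) {
            return true;
        }
    }

    // A Duration needs a time component, so measure from midnight to midnight.
    public static long daysViaDuration(LocalDate from, LocalDate to) {
        return Duration.between(from.atStartOfDay(), to.atStartOfDay()).toDays();
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2014, 12, 1);
        LocalDate d2 = LocalDate.of(2014, 12, 17);
        System.out.println(durationBetweenDatesFails(d1, d2)); // true
        System.out.println(daysViaDuration(d1, d2));           // 16
    }
}
```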
So I guess we need to finish by understanding what the Period object is...
Periods or Durations
Well, the official definition is that a Duration object works with time - that is hours, minutes and seconds - whereas a Period object works with dates - or rather years, months and days. In reality I find you can forget this and just remember that if you're working with LocalDate you must use Period, and if you're working with Instant, LocalDateTime or ZonedDateTime, use Duration.
Admittedly it's not that simple (nothing is really with Java Time) - there's a rather interesting note in the JavaDocs that says that a Duration of 1 day will always be 24 hours, but a Period of 1 day might be 23, 24 or 25 hours due to the impacts of daylight saving time changes.
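That JavaDoc note is easy to see in action. The following sketch adds one day to a ZonedDateTime in two ways, across the night the UK clocks went back in 2014. The class and method names are my own; the zone and date are chosen to match the earlier examples.

```java
import java.time.Duration;
import java.time.Period;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class PeriodVersusDuration {

    // Midday on the day before the UK clocks went back in 2014.
    public static ZonedDateTime start() {
        return ZonedDateTime.of(2014, 10, 25, 12, 0, 0, 0, ZoneId.of("Europe/London"));
    }

    // Hours actually elapsed when we add a Period of 1 day (same wall-clock time tomorrow).
    public static long hoursInPeriodDay() {
        ZonedDateTime s = start();
        return Duration.between(s, s.plus(Period.ofDays(1))).toHours();
    }

    // Wall-clock hour we land on when we add a Duration of 1 day (exactly 24 hours).
    public static int hourAfterDurationDay() {
        return start().plus(Duration.ofDays(1)).getHour();
    }

    public static void main(String[] args) {
        System.out.println(hoursInPeriodDay());     // 25
        System.out.println(hourAfterDurationDay()); // 11
    }
}
```

Adding the Period keeps the clock reading at midday, which takes 25 real hours; adding the Duration takes exactly 24 hours, so the clock reads 11am.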
Confused? Yes, I am too - I guess until we have all been building lots of code that manipulates dates we won't be completely familiar with which is the best object type to use and when, but I do at least get a sense that the structure of java.time is comprehensive. So far I haven't found any aspect of date manipulation or date mathematics that I can't do with java.time, although it can sometimes take what feels like quite a lot of effort.
So has Java finally come up with a replacement for Date that works? Well I want to say yes... It's a cautious yes for now but if I'm honest, I like what I'm using!
Thursday, 20 November 2014
Do the new Java 8 Date and Time objects make 3rd party date libraries redundant?
This is the first of two blog posts which are a follow up to Virtual Pair Programmers’ popular Java Fundamentals training course. This course was written with Java 7, and while everything in the course is still valid for Java 8, I thought a blog post about dates and times was worthwhile.
There are other changes in Java 8, although I’d say that these don’t affect the fundamentals. The biggest change is the introduction of lambda expressions, and I’m currently working on an “Advanced Java” course for Virtual Pair Programmers, which will cover this amongst other topics… more on that later!
In the Java Fundamentals course, we say that the Date library in Java has always been a messy affair, and that while the GregorianCalendar object can be useful, if you need to do any kind of date manipulation in Java, you probably will be using an external library. In the course we use JodaTime, which seems to be the go-to library for most developers.
However Java 8 introduces some new date and time functionality, through the new java.time library, so I thought I’d take a look and see whether this might now make JodaTime redundant. In this post we’ll look at how to manipulate dates and times in Java 8, and in the following post we’ll explore in more detail some of the different objects within the Java 8 date and time libraries.
Manipulating Dates with java.time
So our starting point for this post is the most common operation I find myself writing code for when it comes to working with dates… and that is comparing two dates to see if they’re the same. When I teach this to new programmers, I sometimes use the following example, as a way to practice date manipulation:
Imagine that there’s a secret date that we’re trying to find out. We know it’s between 1st January 2000 and 31st December 2020.
To find out what the secret date is, we can only ask questions in the format “how does it compare to 16th November 2012 at 6.15am?”, and we’ll get the answer “the secret date is before, after or matches that date”.
The implementation of this comparison question in JodaTime is really simple – we use the DateTime object to store the date, and use the compareTo function to compare 2 dates. This simple method does the job:
    public int compareDate(DateTime sampleDate) {
        return sampleDate.compareTo(new DateTime(2012, 11, 16, 6, 15));
    }
The method takes a date and time (sampleDate) as a parameter and returns a +1 if the date and time we supply is after the date we’re looking for (the secret date), a 0 if it matches, or a -1 if the date and time we supply is before the date we’re looking for.
So the challenge is: given that we can only ask this question, how do we find out what the secret date is?
To find the answer, we play a game. We know that 1st Jan 2000 is before the secret date, and 31st December 2020 is after it. So let’s pick a date in the middle – say 1st January 2010, and ask the question for that date. The answer comes back “the secret date is after 1st January 2010”.
Now we know the date lies between 1st January 2010 and 31st December 2020. So we’ll try a date between those two – perhaps 1st January 2015.
We get the answer that the secret date is before 1st January 2015, so we now know it falls between 1st January 2010 and 1st January 2015.
We can keep repeating this - choosing a date between our known upper and lower limits, and revising one of those limits each time until we find the date. Here’s the code that does this and the output it produces:
    import org.joda.time.DateTime;

    public class Main {

        // Compares the supplied date with the secret date:
        // returns -1 if it is earlier, 0 if it matches, +1 if it is later.
        public static int compareDate(DateTime sampleDate) {
            return sampleDate.compareTo(new DateTime(2012, 11, 16, 6, 15));
        }

        public static void main(String[] args) {
            DateTime lower = new DateTime(2000, 1, 1, 0, 0);
            DateTime higher = new DateTime(2020, 12, 31, 23, 59);
            int result = 100; // any non-zero value, so that the loop runs at least once
            int steps = 0;
            DateTime foundDate = new DateTime();
            while (result != 0) {
                // Half the gap between the limits, in seconds (milliseconds / 1000 / 2)
                int difference = (int) ((higher.getMillis() - lower.getMillis()) / 2000);
                foundDate = lower.plusSeconds(difference);
                result = compareDate(foundDate);
                if (result == -1) {
                    lower = foundDate;  // our guess is too early, so raise the lower limit
                } else if (result == 1) {
                    higher = foundDate; // our guess is too late, so reduce the upper limit
                }
                steps++;
            }
            System.out.println("Found date " + foundDate + " in " + steps + " steps.");
        }
    }

Found date 2012-11-16T06:15:00Z in 29 steps.
I think this is a useful exercise for students new to JodaTime as it shows us:
- how to compare two dates (using compareTo),
- how to determine the length of time between two dates (using getMillis), and
- how to add time to a date (using plusSeconds).
In a full lesson we would explore the other options, such as adding days or weeks rather than seconds, but you get the idea. We cover a number of the main things that people tend to want to do with dates in one simple exercise.
So let’s now suppose we have Java 8 and are not allowed to use JodaTime – does the new Java 8 Time library allow us to achieve this task?
Well the good news is that the answer is yes, and in fact the code is almost identical to what we have above. The two key objects we need are:
- java.time.Instant – this represents a Date and Time* – we'll consider that the equivalent to our JodaTime DateTime object, and
- java.time.Duration – this class has a static between() method, which gives us the difference between two Instants as a Duration, and a toMillis() method, which converts that difference to milliseconds.
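To make those two classes a little more concrete before the full listing, here is a minimal sketch of the calls involved. The class name InstantBasics and the helper millisBetween are just names I've made up for this illustration; the java.time calls themselves (Instant.parse, plusSeconds, compareTo, Duration.between and toMillis) are the standard ones.

```java
import java.time.Duration;
import java.time.Instant;

class InstantBasics {

    // Milliseconds from 'from' to 'to'; negative if 'to' is the earlier of the two.
    static long millisBetween(Instant from, Instant to) {
        // Duration.between is a static method that returns a Duration instance
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2012-11-16T06:15:00Z");
        Instant end = start.plusSeconds(90); // Instants are immutable, so this returns a new one

        System.out.println(millisBetween(start, end));    // 90000

        // compareTo is negative, zero or positive, as with any Comparable
        System.out.println(start.compareTo(end) < 0);     // true
        System.out.println(start.compareTo(start) == 0);  // true
    }
}
```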
Here’s the new code… the output is identical to the JodaTime version!
    import java.time.Duration;
    import java.time.Instant;

    public class Main {

        // Compares the supplied instant with the secret date:
        // negative if it is earlier, 0 if it matches, positive if it is later.
        public static int compareDate(Instant sampleDate) {
            return sampleDate.compareTo(Instant.parse("2012-11-16T06:15:00Z"));
        }

        public static void main(String[] args) {
            Instant lower = Instant.parse("2000-01-01T00:00:00Z");
            Instant higher = Instant.parse("2020-12-31T23:59:00Z");
            int result = 100; // any non-zero value, so that the loop runs at least once
            int steps = 0;
            Instant foundDate = Instant.now();
            while (result != 0) {
                // Half the gap between the limits, in seconds (milliseconds / 1000 / 2)
                int difference = (int) (Duration.between(lower, higher).toMillis() / 2000);
                foundDate = lower.plusSeconds(difference);
                result = compareDate(foundDate);
                // Instant.compareTo only promises the sign of its result, so test for < 0 and > 0
                if (result < 0) {
                    lower = foundDate;
                } else if (result > 0) {
                    higher = foundDate;
                }
                steps++;
            }
            System.out.println("Found date " + foundDate + " in " + steps + " steps.");
        }
    }

On the basis that you understand the JodaTime version, the new Java 8 version is straightforward.
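One thing worth noting about the loop above is the (int) cast: it's fine for our 21-year range, but half the gap in seconds would no longer fit in an int once the range grows beyond roughly 136 years. As a hedged variation (not the version used in the course), the sketch below keeps the gap as a long by using Duration's getSeconds() method, tests only the sign of the compareTo result, and stops with an exception rather than looping forever if the secret date can't be found. The names LongSecondsSearch, SECRET and search are my own, chosen just for this example.

```java
import java.time.Duration;
import java.time.Instant;

class LongSecondsSearch {

    // The secret date from the example above, hard-coded purely for illustration.
    static final Instant SECRET = Instant.parse("2012-11-16T06:15:00Z");

    static int compareDate(Instant sampleDate) {
        return sampleDate.compareTo(SECRET);
    }

    // Bisects between lower and higher (both exclusive) at whole-second resolution.
    static Instant search(Instant lower, Instant higher) {
        while (true) {
            // getSeconds() returns a long, so no narrowing cast is needed
            long halfSeconds = Duration.between(lower, higher).getSeconds() / 2;
            if (halfSeconds == 0) {
                // Nothing left to bisect: the secret is not a whole second inside the range
                throw new IllegalStateException("secret date not found in range");
            }
            Instant candidate = lower.plusSeconds(halfSeconds);
            int result = compareDate(candidate);
            if (result == 0) {
                return candidate;
            } else if (result < 0) {
                lower = candidate;   // candidate is before the secret date
            } else {
                higher = candidate;  // candidate is after the secret date
            }
        }
    }

    public static void main(String[] args) {
        Instant found = search(Instant.parse("2000-01-01T00:00:00Z"),
                               Instant.parse("2020-12-31T23:59:00Z"));
        System.out.println("Found date " + found);
    }
}
```

Because the candidate dates are still whole seconds chosen in the same way, this should visit the same dates as the listing above; the only differences are the wider arithmetic and the safety check.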
So it seems that for many date manipulation tasks the new Java 8 objects will meet our needs, and can be considered a success. JodaTime can do much more than just this, so it won’t be redundant just yet, but I think it will be an extra overhead we won’t need for basic date manipulation.
What we have seen so far is 2 of the Java 8 classes from the java.time library – the Instant and the Duration class. There are other useful classes in this library however, and we’ll look at some of these in the next post.
* To say that an Instant represents a Date and Time is not the full story - it's got a very precise definition and we'll be exploring that in the next blog post.
Monday, 13 October 2014
Hadoop Course - Understanding the Scripts
Over the last week or so we've had a few support calls asking questions about the scripts provided in chapter 5 of the course, which are used to switch Hadoop between standalone and pseudo-distributed modes.
This post will explain in a bit more detail what each script is and how it works. These are custom scripts that I've developed while working with Hadoop, so you won't (probably!) find them elsewhere on the internet, but I think they make the process of managing Hadoop configurations on a development machine really easy.
Until you've got through chapter 8 of the course not everything in this post will make sense, but feel free to contact me if you have any questions after reading this - or raise a support call through https://www.virtualpairprogrammers.com/technical-support.html
There are 4 scripts provided with the course:
(1) resetHDFS
This script is designed to clear down your HDFS workspace - that is to empty out all the files and folders in the Hadoop file system. It's like formatting a drive. What the script actually does is:
- stop any running Hadoop processes
- delete the HDFS folder structure from your computer
- recreate the top level HDFS folder, and set its permissions so that the logged on user can write to it
- run the hdfs format command - this will create the sub-folder structure needed
- restart the hadoop processes
- create the default folder structure within HDFS that's required for your pseudo-distributed jobs (/user/yourusername)
NOTES:
(1) You must be in the folder where the script is located to run this script. You should run it by entering the following command:
./resetHDFS
(2) The script contains a number of lines that must be run with admin privileges - these contain the word sudo. As a result, running this script will require you to enter your admin password 1 or more times. Although this might seem frustrating, you will not be running this script regularly - only when you wish to delete all your data, and then it's a quick and easy way to do it.
(3) Because this script creates the HDFS required file and folder structures, we use it to create them for the first time. When the course was first released there was a typing error - on line 2, sudo was misspelt sduo. This has been corrected but if you have downloaded a copy with the typo, you might wish to correct it!
(2) startHadoopPseudo
This script will switch Hadoop into Pseudo-distributed mode - if you're currently in standalone mode then this is the only script you need to run.
What the script actually does is:
- remove the existing symbolic link to the configuration directory
- create a new symbolic link to the configuration directory containing the pseudo-distributed configuration files
- start the Hadoop processes
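If the idea of swapping a symbolic link is new to you, the sketch below shows the principle. To be clear, the course scripts are shell scripts and this is not their content; it's the same remove-then-recreate idea written in Java, working in a temporary folder so nothing real is touched. The folder names conf.standalone and conf.pseudo, and the names ConfigLinkSwitcher and switchConfig, are made up for the example. It assumes a file system that supports symbolic links (on Windows this may need extra privileges).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ConfigLinkSwitcher {

    // Points 'link' at 'targetDir', replacing any link that is already there.
    static void switchConfig(Path link, Path targetDir) throws IOException {
        Files.deleteIfExists(link);                 // removes the link itself, not the folder it points to
        Files.createSymbolicLink(link, targetDir);  // create the new link
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout, built in a temporary folder
        Path base = Files.createTempDirectory("hadoop-demo");
        Path standalone = Files.createDirectory(base.resolve("conf.standalone"));
        Path pseudo = Files.createDirectory(base.resolve("conf.pseudo"));
        Path link = base.resolve("conf");

        switchConfig(link, standalone);
        System.out.println("conf -> " + Files.readSymbolicLink(link));

        switchConfig(link, pseudo);
        System.out.println("conf -> " + Files.readSymbolicLink(link));
    }
}
```

Anything that reads its settings through the link picks up whichever set of configuration files the link currently points at, which is why no files need to be copied or edited when switching modes.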
(3) stopHadoop
This script simply stops the Hadoop processes - it should be run if you're in pseudo-distributed mode and are going to switch back to standalone mode. It doesn't change any configuration settings; it just stops the running processes.
(4) startHadoopStandalone
This script removes the existing symbolic link to the configuration directory, and creates a new symbolic link to the configuration directory containing the standalone files. Although I've called this script "startHadoopStandalone" it doesn't actually start anything, as no processes run in standalone mode.
So... which scripts do you need to run and when:
If you're in standalone mode and you want to be in pseudo-distributed mode, just run startHadoopPseudo
If you're in pseudo distributed mode and you want to be in standalone mode, first run stopHadoop and then run startHadoopStandalone
If you have just switched on your machine and want to run in either mode - just run the relevant start script. In this instance you don't need to run the stop script because you have no running processes if you have just booted up.
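If it helps, the three rules above boil down to two questions: are the Hadoop processes currently running, and which mode do you want? Here's a small sketch of that decision written as Java. It doesn't run any scripts, it only returns their names in the order to run them. ScriptChooser and scriptsToRun are names invented for this example, and returning nothing when you're already in pseudo-distributed mode is my own assumption, as that case isn't one of the rules above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ScriptChooser {

    // 'processesRunning' is true only when Hadoop is up in pseudo-distributed mode;
    // it is false in standalone mode and straight after booting the machine.
    static List<String> scriptsToRun(boolean processesRunning, boolean wantPseudoDistributed) {
        if (wantPseudoDistributed) {
            if (processesRunning) {
                return Collections.emptyList(); // assumption: already in the mode we want
            }
            return Collections.singletonList("startHadoopPseudo");
        }
        if (processesRunning) {
            // Stop the processes first, then point the link at the standalone files
            return Arrays.asList("stopHadoop", "startHadoopStandalone");
        }
        return Collections.singletonList("startHadoopStandalone");
    }

    public static void main(String[] args) {
        System.out.println(scriptsToRun(false, true));  // [startHadoopPseudo]
        System.out.println(scriptsToRun(true, false));  // [stopHadoop, startHadoopStandalone]
        System.out.println(scriptsToRun(false, false)); // [startHadoopStandalone]
    }
}
```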