Monday, December 23, 2013

Apache Flume error - java.lang.NoSuchMethodError: twitter4j.FilterQuery.setIncludeEntities(Z)Ltwitter4j/FilterQuery;

If you set up your Flume agent with the Twitter source (see "A simple tutorial on how to setup Apache flume, HDFS, Oozie and Hive") and you get the following error in your flume-ng log file, you can try the following steps to fix it:

Error message:
org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.NoSuchMethodError: twitter4j.FilterQuery.setIncludeEntities(Z)Ltwitter4j/FilterQuery;
at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:139)
at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

1. Check the flume-sources-1.0-SNAPSHOT.jar file and make sure FilterQuery.class is in it:
# /usr/lib/jvm/jdk1.7.0/bin/jar tvf /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/flume-ng/lib/flume-sources-1.0-SNAPSHOT.jar | grep FilterQuery
  4451 Tue Nov 13 10:06:42 EST 2012 twitter4j/FilterQuery.class

2. If FilterQuery.class exists in flume-sources-1.0-SNAPSHOT.jar, verify that the class has the setIncludeEntities method. First extract the class file:
# jar xf ./flume-sources-1.0-SNAPSHOT.jar twitter4j/FilterQuery.class

Then upload FilterQuery.class to showmycode.com and check for the setIncludeEntities method.
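As an alternative to an online decompiler, the JDK's own javap tool can list a class's methods straight from the jar, without extracting anything. The sketch below assumes a JDK is installed and javap is on the PATH; the helper name has_method is made up for illustration.

```shell
# has_method JAR CLASS METHOD
# Lists the members of CLASS as resolved with JAR on the classpath and keeps
# only the lines that mention METHOD. The exit status is 0 if at least one
# line matched. Assumes javap (shipped with the JDK) is on PATH.
has_method() {
  javap -classpath "$1" "$2" | grep -F -- "$3"
}

# Example (run from the directory that holds the jar):
#   has_method ./flume-sources-1.0-SNAPSHOT.jar twitter4j.FilterQuery setIncludeEntities
```

If the method is present, a line mentioning setIncludeEntities is printed. No output means the copy of the class in that jar does not have it.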

3. If steps 1 and 2 check out, there is probably another jar file that contains the same class without this method, and the Flume agent's Java process is picking up that copy instead. Check the full classpath used when starting the Flume agent. You can use the following command:
# /usr/bin/flume-ng agent start --conf /etc/flume-ng/conf/ -f /etc/flume-ng/conf/flume.conf -Dflume.root.logger=DEBUG,console -n TwitterAgent

or in Cloudera Standard:
"Services" -> "flume1" -> "Instances" -> "Processes" -> "Show Recent Logs" -> "Full stderr log".


Look for the "*/flume-ng/lib/" directory in the output.

Check the contents of the jar files in that directory for FilterQuery.class. In my case, the lib directory is "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/flume-ng/lib/":

# cd /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/flume-ng/lib/
# find . -name "*.jar" | xargs grep FilterQuery.class

Binary file ./search-contrib-0.9.1-cdh4.3.0-SNAPSHOT-jar-with-dependencies.jar matches

So another JAR file, search-contrib-0.9.1-cdh4.3.0-SNAPSHOT-jar-with-dependencies.jar, contains the same class and conflicts with the correct one in FLUME_CLASSPATH.
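The find/xargs search used in this step splits paths on whitespace and prints one "Binary file ... matches" line per hit. A slightly more robust sketch of the same search follows; the function name jars_containing is hypothetical. It relies on the same trick as the original command: entry names are stored uncompressed inside a jar (zip) file, so a fixed-string grep can find them.

```shell
# jars_containing DIR ENTRY
# Prints the path of every *.jar under DIR whose raw bytes contain ENTRY,
# e.g. twitter4j/FilterQuery.class. More than one line of output suggests
# that the class is packaged in several jars.
jars_containing() {
  # -l prints only the names of matching files, -F treats ENTRY literally
  find "$1" -type f -name '*.jar' -exec grep -lF -- "$2" {} +
}

# Example:
#   jars_containing /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/flume-ng/lib twitter4j/FilterQuery.class
```

Because this is a raw byte match, it is worth confirming each hit with "jar tvf" as in step 1.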

One dirty solution is to rename "search-contrib-0.9.1-cdh4.3.0-SNAPSHOT-jar-with-dependencies.jar" to "search-contrib-0.9.1-cdh4.3.0-SNAPSHOT-jar-with-dependencies.jar.org", then restart the Flume agent. You should no longer see "java.lang.NoSuchMethodError: twitter4j.FilterQuery.setIncludeEntities(Z)Ltwitter4j/FilterQuery;".
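The rename presumably works because a Java wildcard classpath entry such as lib/* only picks up files ending in .jar, so a file with any other suffix is skipped. A minimal sketch of the rename, using a hypothetical helper name and a guard against a mistyped path:

```shell
# disable_jar JAR
# Renames JAR to JAR.org so that it no longer matches *.jar.
# Undo with: mv JAR.org JAR
disable_jar() {
  if [ ! -f "$1" ]; then
    echo "disable_jar: no such file: $1" >&2
    return 1
  fi
  mv -- "$1" "$1.org"
}

# Example (run as a user allowed to write to the lib directory):
#   disable_jar /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/flume-ng/lib/search-contrib-0.9.1-cdh4.3.0-SNAPSHOT-jar-with-dependencies.jar
```

Anything else that relies on the renamed jar loses it too, which is presumably why this counts as a dirty solution.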

The better solution would be to update "search-contrib-0.9.1-cdh4.3.0-SNAPSHOT-jar-with-dependencies.jar" to the latest version, or to change the source and regenerate the jar file.

For how to compile and build "flume-sources-1.0-SNAPSHOT.jar", please read:
http://tonylixu.blogspot.ca/2014/07/hadoop-how-to-compile-flume-sources-10.html

10 comments:

Abhishek Chaudhary said...

Hello Tony,

This error has been troubling me for a very long time. I am not able to figure out the solution. I don't have any files under /opt/cloudera/parcels/. The folder is simply empty. When I start the agent with debug logging, this is the error I get (the same one you mentioned):


2014-01-14 14:42:00,749 (lifecycleSupervisor-1-0) [ERROR - org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:253)] Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.NoSuchMethodError: twitter4j.FilterQuery.setIncludeEntities(Z)Ltwitter4j/FilterQuery;
at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:139)
at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2014-01-14 14:42:00,751 (lifecycleSupervisor-1-0) [DEBUG - com.cloudera.flume.source.TwitterSource.stop(TwitterSource.java:152)] Shutting down Twitter sample stream...

I am using the latest Cloudera version, 4.5, on a Windows Azure CentOS 6 VM. I have checked flume-sources-1.0-SNAPSHOT.jar and it does contain the FilterQuery class. Can you please help?

Thanks

Tony Xu said...

Hi Abhishek, which version of Apache Flume do you use? The Cloudera version or the official Apache version? If you use the official Apache version, you won't have anything in /opt/cloudera/parcels.

Pavan Kumar said...

Wow.. Thanks a lot for your post.. This resolved my issue !!

-- Pavan Kumar SP

Marquise de Chiffo-Nier said...

Thanks for addressing this issue; I'm trying to resolve it myself and your post is rather useful.

PandoraBob said...

Thanks so much for this n_n

Kevin Lefevre said...

You need to recompile flume-sources-1.0-SNAPSHOT.jar from the Git repository: https://github.com/cloudera/cdh-twitter-example

Install Maven, then download the repository of cdh-twitter-example.

Unzip it, then run the following inside the extracted directory (as mentioned in the repository):

$ cd flume-sources

$ mvn package

$ cd ..

This problem appeared when twitter4j was updated from 2.2.6 to 3.X: the setIncludeEntities method was removed, and the JAR is not up to date.

PS: Do not download the prebuilt version, it is still the old one.

Tony Xu said...

Tks Kevin for sharing.

Kishor Pinjarkar said...

Thanks man for your post. It really helps me...

Kishor Pinjarkar said...

Thanks a lot...