Archive for the 'Java' Category

 

Java profiler, notes on tracking memory leaks..

Sep 10, 2008 in Eclipse, Java, JavaUsage

Free profiler for Eclipse .. this project is no longer active; use TPTP instead..

http://eclipsecolorer.sourceforge.net/index_profiler.html

Notes on hunting memory leaks..

http://www.szegedi.org/articles/memleak.html

Notes from ‘Lucene in Action’ ..

Sep 07, 2008 in Books, Java, lucene

Lucene in Action
ERIK HATCHER
OTIS GOSPODNETIC
MANNING
Greenwich

Ch. 1 – Introduction

  • Lucene is a high performance, scalable Information Retrieval (IR) library.
  • Lucene’s creator is Doug Cutting.
  • Creating an index – see ‘Indexer.java’ (in ‘Files’, top right tabs)
  • Indexing API:
    — IndexWriter
    — Directory (RAMDirectory)
    — Analyzer
    — Document
    — Field

  • Searching an index – see ‘Searcher.java’ (in ‘Files’, top right tabs)
  • Searching API:
    — IndexSearcher
    — Term
    — Query
    — TermQuery
    — Hits

Ch. 2 – Indexing

  • The Analyzer tasks:
    — Decompose text into tokens.
    — Remove ‘stop words’.
    — Reduce words to roots.
  • The ‘Inverted Index’ – an efficient method of finding documents
    that contain given words.
    In other words, instead of trying to answer the question “what words are contained
    in this document?” this structure is optimized for providing quick answers to
    “which documents contain word X?”
  • Lucene doesn’t offer an update(Document) method;
    instead, a Document must first be deleted from an index and then re-added to it.
  • Use ‘doc.setBoost(float)’ to adjust the importance of documents.
    Use ‘field.setBoost(float)’ to set level for fields.
  • Using indexable date/time fields to high resolution (milliseconds) may cause
    performance problems.
  • Use indexable numeric fields for range queries (store the size of email messages,
    for example).
  • Tuning indexing performance – system properties org.apache.lucene.X where X is:
    — mergeFactor – 10 – Controls segment merge frequency and size
    — maxMergeDocs – Integer.MAX_VALUE – Limits the number of documents per segment
    — minMergeDocs – 10 – Controls the amount of RAM used when indexing
  • Use ‘addIndexes(Directory[])’ to copy indexes from one IndexWriter to
    another – for example, from RAMDirectory to FSDirectory.
  • Limit Field sizes with maxFieldLength – default is 10K terms per document.
  • Optimizing an index
    — Merging segments
    — Optimizing an index only affects the speed of searches
    against that index, and does not affect the speed of indexing.
    — API invoke pattern:
    IndexWriter writer = new IndexWriter("/path/to/index", analyzer, false);
    writer.optimize();
    writer.close();
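
The ‘Inverted Index’ idea from this chapter can be sketched in plain Java. This is a toy illustration only, not Lucene code and not how Lucene lays out its index; the class name InvertedIndexSketch and its methods are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each word to the sorted ids of the documents containing it.
// Hypothetical class for illustration; this is not part of Lucene.
final class InvertedIndexSketch {

    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Lowercases the text, splits it on non-letters and records docId under every token.
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.isEmpty()) {
                continue; // split() yields a leading empty string if the text starts with a separator
            }
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Answers "which documents contain word X?" with a single map lookup.
    List<Integer> search(String word) {
        Set<Integer> ids = postings.get(word.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<Integer>() : new ArrayList<Integer>(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Lucene is a search library");
        index.add(2, "A library of books");
        index.add(3, "Search the index");
        System.out.println(index.search("library")); // [1, 2]
        System.out.println(index.search("search"));  // [1, 3]
        System.out.println(index.search("missing")); // []
    }
}
```

The lookup cost depends on the number of distinct words, not on the number or size of the documents, which is the point of inverting the structure.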

Ch. 3 – Search in applications

  • Scoring
    Factors:
    — tf(t in d) Term frequency factor for the term (t) in the document (d).
    — idf(t) Inverse document frequency of the term.
    — boost(t.field in d) Field boost, as set during indexing.
    — lengthNorm(t.field in d) Normalization value of a field, given the number of terms within the
    field. This value is computed during indexing and stored in the index.
    — coord(q, d) Coordination factor, based on the number of query terms the
    document contains.
    — queryNorm(q) Normalization value for a query, given the sum of the squared weights
    of each of the query terms.
  • Query types
    — TermQuery
    — RangeQuery
    — PrefixQuery
    — BooleanQuery
    — PhraseQuery
    — WildcardQuery
    — FuzzyQuery (the Levenshtein distance)
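
The Levenshtein (edit) distance mentioned for FuzzyQuery counts the single-character insertions, deletions and substitutions needed to turn one string into another. Below is a plain-Java sketch of the distance itself; it is not Lucene's implementation, and the class name LevenshteinSketch is made up for the example.

```java
// Classic dynamic-programming edit distance, keeping only two rows of the table.
// Hypothetical helper for illustration; not Lucene's FuzzyQuery code.
final class LevenshteinSketch {

    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1]; // distances for the previous prefix of a
        int[] curr = new int[b.length() + 1]; // distances for the current prefix of a
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3
        System.out.println(distance("lucene", "lucent"));  // 1
    }
}
```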

Ch. 4 – Analysis

  • Analysis operations:
    — Extract words
    — Discard punctuation
    — Remove accents from characters
    — Lowercase (also called normalizing)
    — Remove common words
    — Reduce words to a root form (stemming)
    — Change words into the basic form (lemmatization)
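
Several of the analysis operations above can be sketched in plain Java. This is only an illustration, not a Lucene Analyzer: the class name AnalysisSketch and the tiny stop-word list are made up, and stemming and lemmatization are left out.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical mini "analyzer": strips accents, extracts words, lowercases, drops stop words.
final class AnalysisSketch {

    // Assumed stop-word list, for illustration only.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "is", "of", "the"));

    static List<String> analyze(String text) {
        // Remove accents: decompose characters (NFD), then drop the combining marks.
        String noAccents = Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{M}+", "");
        List<String> tokens = new ArrayList<>();
        // Extract words and discard punctuation: split on anything that is not a letter or digit.
        for (String raw : noAccents.split("[^\\p{L}\\p{Nd}]+")) {
            String token = raw.toLowerCase(Locale.ROOT); // lowercase (normalize)
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token); // keep only non-empty, non-stop words
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // \u00e9 is e with an acute accent, escaped so the source stays ASCII.
        System.out.println(analyze("The Caf\u00e9, and the r\u00e9sum\u00e9 of Lucene!"));
        // [cafe, resume, lucene]
    }
}
```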

Java object deep copy..

Aug 29, 2008 in Java

A rough Java equivalent of C/C++ ‘memcpy’ for objects: a deep copy of the whole object graph..

http://javatechniques.com/blog/low-memory-deep-copy-technique-for-java-objects/

And also, a faster variant:

http://javatechniques.com/public/java/docs/basics/faster-deep-copy.html
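
One common way to get a deep copy in plain Java is to serialize the object to an in-memory buffer and read it back. The sketch below assumes everything in the object graph is Serializable; the class name DeepCopySketch is made up, and this is not code taken from the linked articles.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper: deep copy by a serialization round trip through memory.
final class DeepCopySketch {

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Write the whole object graph into the in-memory buffer.
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        // Read it back: the result shares no mutable objects with the original.
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("a"));
        ArrayList<StringBuilder> copy = deepCopy(original);
        copy.get(0).append("b");             // mutate the copy only
        System.out.println(original.get(0)); // a
        System.out.println(copy.get(0));     // ab
    }
}
```

Transient and static fields are not carried over, so this fits plain data objects better than objects holding resources.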

Simple popup display widget ..

Aug 11, 2008 in Java, JavaUsage

import javax.swing.JOptionPane;
import javax.swing.JFrame;

public class JPopup extends JFrame {

    public JPopup(String msg) {
        JOptionPane.showMessageDialog(this, msg);
        System.exit(0);
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: JPopup message");
            System.exit(1);
        }
        new JPopup(args[0]);
    }

}

Eclipse TPTP Setup with Tomcat ..

May 10, 2007 in Eclipse, Java, Uncategorized

Good info on setting up Eclipse TPTP (Tracing and Profiling Tools Project) with Tomcat.

http://www.symentis.com/w-howTo/Eclipse-TPTP-Tomcat-HowTo.pdf


A good article on the Eclipse profiler

Feb 15, 2007 in Eclipse, Java

http://www.theserverside.com/tt/articles/article.tss?l=EclipseProfiler

The profiler discussed is at

http://eclipsecolorer.sourceforge.net/index_profiler.html

or

http://sourceforge.net/projects/eclipsecolorer

Note: this profiler does not work with IBM Java, since the profiler relies on JVMPI
(the Java Virtual Machine Profiler Interface).

IBM Java garbage collection analysis..

Feb 13, 2007 in Java

IBM Java garbage collection logging is enabled via the CLI option
-Xverbosegclog:<filename>.

To interpret the log, use the IBM Pattern Modeling and
Analysis Tool for Java Garbage Collector (PMAT).

PMAT is located at


http://www.alphaworks.ibm.com/tech/pmat

Java 6 Troubleshooting Tools ..

Jan 30, 2007 in Java

From: http://java.sun.com/javase/6/webnotes/trouble/

http://java.sun.com/javase/6/webnotes/trouble/other/tools6-Unix.html

  • Monitoring Tools
  • Debugging Tools
    • HPROF profiler

      To invoke the HPROF tool: java -agentlib:hprof ToBeProfiledClass

      To print the complete list of options: java -agentlib:hprof=help

    • jdb
    • jhat
    • jinfo

      * Print command line flags and system properties for a running process,
      from a core file, or for a remote debug server.

    • jmap

      * Print shared object mappings for a process, a core file, or a remote debug server.

    • jsadebugd

      Serviceability Agent Debug Daemon, which acts as debug server.

    • jstack

      * Print stack traces of threads for a process, core file, or remote debug server

Notes (Part 3) from HeadFirst Java..

May 14, 2005 in Books, Java, Uncategorized




  • — Basic network read pattern:
    Socket socket = new Socket("someserver.com", 5000);
    InputStreamReader isr = new InputStreamReader(socket.getInputStream());
    BufferedReader reader = new BufferedReader(isr);
    String message = reader.readLine();
    

    — Basic network write pattern:

    Socket socket = new Socket("someserver.com", 5000);
    PrintWriter writer = new PrintWriter(socket.getOutputStream());
    writer.println("message to send");
    writer.flush(); // PrintWriter buffers output; flush so the message is actually sent
    
  • — Basic thread pattern
    public class RunnableJob implements Runnable {
      public void run() {
        ...
      }
    }
    
    RunnableJob rj = new RunnableJob();
    Thread thread = new Thread(rj);
    thread.start();
  • — Thread data access synchronization
    — If we synchronize two static methods in a single
    class, a thread will need the class lock to enter either
    of the methods.
  • — Collections
    — ArrayList
    — TreeSet – elements sorted, no duplicates.
    — HashMap – name/value pairs.
    — LinkedList – better performance for insert and delete of elements
    in the middle of the list (lookups by index are slower than with ArrayList).
    — HashSet – no duplicates, fast lookup of a given element.
    — LinkedHashMap – same as HashMap plus preserves order of addition.
  • — Basic sorting pattern:
    ArrayList<String> slist = new ArrayList<String>();
    slist.add("a string");
    slist.add("...");
    Collections.sort(slist);
    

    To be sorted this way, the elements of the list must implement
    “Comparable” (self compare).

    Alternatively, pass a comparator (call it MyCompare) which
    implements Comparator<MyObject> and its compare(MyObject, MyObject)
    method, like this:

    Collections.sort(theList, new MyCompare());
    
  • — Collection types summary:
    — List – sequence.
    — Set – uniqueness.
    — Map – key search.
  • — HashSet duplicate check methods:
    hashCode(), equals().
  • — Control on polymorphic ‘collection type’ usage..
    public void takeAnimals(ArrayList<? extends Animal> animals) {
      ...
      animals.add(someAnimal); // <-- add is forbidden by the '?' wildcard
    }
    
  • -- Static nested classes have access to static variables
    of the enclosing class.

    -- Anonymous nested classes have peculiar syntax capabilities:

    button.addActionListener(new ActionListener() {
      public void actionPerformed(ActionEvent ev) {
        System.exit(0);
      }
    }
    );
    

    ActionListener is not a class but an interface; in
    this context the 'new' instruction means 'create an
    anonymous class that implements the ActionListener
    interface, and instantiate it'.

  • -- Access levels and modifiers:
    -- public - access by anybody.
    -- protected - same package + subclasses in or out of
    the same package.
    -- default - same package only.
    -- private - same class only.
  • -- Enumerations - a set of constant values that
    represent the only valid values for a variable.

    public enum Members { JERRY, BOBBY, PHIL };
    public Members bandMember;
    ...
    if (bandMember == Members.JERRY) {
     ...
    }
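
The two sorting routes from the notes above (elements that implement Comparable, or a separate comparator passed to Collections.sort) can be sketched together. The Song class, its fields and the sample values are all made up for the example; ArtistCompare plays the role of the MyCompare mentioned earlier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SortSketch {

    // Hypothetical element type; Comparable gives it a natural order (by title).
    static final class Song implements Comparable<Song> {
        final String title;
        final String artist;

        Song(String title, String artist) {
            this.title = title;
            this.artist = artist;
        }

        @Override
        public int compareTo(Song other) {
            return title.compareTo(other.title);
        }

        @Override
        public String toString() {
            return title + " (" + artist + ")";
        }
    }

    // External ordering, used when the natural order is not the one wanted.
    static final class ArtistCompare implements Comparator<Song> {
        @Override
        public int compare(Song one, Song two) {
            return one.artist.compareTo(two.artist);
        }
    }

    public static void main(String[] args) {
        List<Song> songs = new ArrayList<>();
        songs.add(new Song("Waltz", "Anna"));
        songs.add(new Song("Ballad", "Zoe"));
        songs.add(new Song("March", "Mike"));

        Collections.sort(songs);                      // uses Song.compareTo: by title
        System.out.println(songs);

        Collections.sort(songs, new ArtistCompare()); // uses the comparator: by artist
        System.out.println(songs);
    }
}
```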
    

Notes (Part 2) from HeadFirst Java..

May 08, 2005 in Books, Java, Software




  • — Some methods of class Object
    equals(), getClass(), hashCode(), toString().
  • — A way of thinking about superclasses vs interfaces:
    Superclasses implement types, interfaces implement roles.
  • — About the Constructor
    — It is used to initialize the state of a new object.
    — If none is written, the compiler will create a default one (with no args.)
    — If one is written (any one) the compiler will not supply the default one.
    .. ergo ..
    — If a no-arg one is wanted and there is at least another one, the default
    one must be written.
    — It is good to always supply a default one with default values.
    — Overloaded ones must have different arguments (just like any other method).
    — Argument list concept includes order and/or type of arguments.
    — If the nature of the class is such that the object MUST be initialized
    (such as the Color class), do not write a default one.
  • — Every constructor can call super() or this() but never both.
  • — To prevent a class from being instantiated into objects,
    mark the constructor ‘private’. For example, many ‘utility’
    classes have only ‘static’ methods and they should not be
    instantiated.
  • — Finals..
    — A final variable cannot have its value changed.
    — A final method cannot be overridden.
    — A final class cannot be extended.
  • — Serialization..
    — Write object basic pattern:

    .. class SomeClass implements Serializable ..
    SomeClass sc = new SomeClass();
    try {
      FileOutputStream fs = new FileOutputStream("sc.ser");
      ObjectOutputStream oos = new ObjectOutputStream(fs);
      oos.writeObject(sc);
      oos.close();
    }
    catch(Exception exc) {
      ...
    }
    

    — If a variable in a ‘Serializable’ class cannot (or should not)
    be serialized, it must be marked ‘transient’.

    — Read object(s) basic pattern:

    FileInputStream fis = new FileInputStream("foo.ser");
    ObjectInputStream ois = new ObjectInputStream(fis);
    Object one = ois.readObject();
    Object two = ois.readObject();
    ...
    SomeClass sc = (SomeClass) one;
    SomeOtherClass soc = (SomeOtherClass) two;
    ...
    ois.close();
    

    — Static variables are not serialized.

    — Class changes (versions) may break serialization.

  • — Java new IO (nio) features
    — IO performance improvements by using
    native IO facilities.
    — Direct control of buffers.
    — Non-blocking IO capabilities.
    — Existing File[Input|Output]Stream classes
    use NIO internally. Some NIO features may be
    accessed via their ‘channels’ (getChannel()).
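
A minimal sketch of reaching NIO through a classic stream: FileInputStream.getChannel() returns a FileChannel that reads into a ByteBuffer under direct control of the caller. The class name ChannelReadSketch and the temporary file are made up for the example, and the file is assumed to be small enough to fit in one buffer.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: read a small file through the channel of a FileInputStream.
final class ChannelReadSketch {

    static String readAll(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile());
             FileChannel channel = in.getChannel()) {
            // Assumes the file fits in memory (and in an int-sized buffer).
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading until the buffer is full or end of file is reached
            }
            buffer.flip(); // switch the buffer from being written to being read
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("nio-sketch", ".txt");
        try {
            Files.write(tmp, "hello channels".getBytes(StandardCharsets.UTF_8));
            System.out.println(readAll(tmp)); // hello channels
        } finally {
            Files.delete(tmp);
        }
    }
}
```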