Gridsphere Infrastructure

tdar-todo

  1. handler

    Learn the Gridsphere architecture (compiling a list here: [Developer]) and clean up the Access/Excel file upload functionality. See if we can factor out the dataset registration phase entirely into its own "submit portlet", using the existing GEON dataset registration pages in CorePortlet as a start.

    Priority MEDIUM
    alee
    N/A
  2. handler

    split this page up into infrastructure vs. developer documentation

    Priority MEDIUM
    alee
    N/A
  3. handler

    investigate Liferay as portal software, http://www.liferay.com/web/guest/products/portal - use with Spring Web MVC Portlets? (switching to Struts 2.0 instead)

    Priority MEDIUM
    alee
    N/A
  4. handler

    Look at logs to see if there's any indication of why GEON is so sluggish. Find out where logs are.

    Priority MEDIUM
    alee
    N/A
  5. handler

    improve security/harden machines - get rid of root access and any unnecessary public system information

    Priority MEDIUM
    alee
    N/A
  6. handler

    set up Java 1.5 or later on portal and dev nodes

    Priority MEDIUM
    alee
    N/A
  7. handler

    set up development environment on gama node

    Priority MEDIUM
    alee
    N/A
  8. handler

    configure access from portal node to data node's postgres server.

    Priority MEDIUM
    alee
    N/A
  9. handler

    make deployment from fresh SVN checkout possible (still needs improvement)

    Priority MEDIUM
    alee
    N/A
  10. handler

    factor out dependencies on gridsphere into jar dependencies so CorePortlet, etc., just refer to the gridsphere/gridportlets/geonutils jarfiles instead. We are not going to be modifying the altered GEON gridsphere sourcecode, just depending on it.

    Priority MEDIUM
    alee
    N/A

Installing software and important system commands

  1. update /etc/sysconfig/rhn/sources to use a closer up2date mirror, e.g., http://mirrors.usc.edu/pub/linux/distributions/centos/ or http://mirror.stanford.edu/yum/pub/centos/
  2. up2date <package-name>, e.g., up2date -v subversion
  3. to set up yum, first run up2date yum, then configure yum repos in /etc/yum.repos.d if necessary, and then run yum update or yum install|update foo
  4. chkconfig for setting up boot-time services

Setting up tomcat+apache integration via mod_jk

  1. download tomcat-connector-src
  2. standard ./configure --with-apxs=/usr/sbin/apxs && make && make install incantation
  3. create and set up mod_jk configuration files: /etc/httpd/conf.d/88mod_jk.conf which contains a reference to /etc/httpd/jk-workers.properties
  4. set up default virtual host on apache in /etc/httpd/conf/httpd.conf and add appropriate JkMount directives for the URLs we want to be handled by Tomcat.
  5. ensure that Tomcat is listening on port 8009 (or whatever port was specified in the httpd jk-workers.properties file) with an AJP13 connector.
  6. Had to fix existing /etc/httpd/conf/httpd.conf and change the DefaultType text/plain to DefaultType text/html
  7. restart httpd + tomcat
  8. Finally, http://portal.tdar.org/gridsphere works
  9. update: In order to get myProjects functionality working also have to JkMount /CorePortlet/* and /team/* - there may be more.
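
The two mod_jk files from the steps above can be sketched with a small generator. This is a minimal sketch, not the configuration actually deployed on the portal node: the worker name worker1 and the host localhost are assumptions, and the context list simply repeats the URLs mentioned in this section.

```python
def render_workers_properties(worker="worker1", host="localhost", port=8009):
    """Return the text of a minimal jk-workers.properties with one AJP13 worker.

    The worker name and host are illustrative assumptions; the port must match
    the AJP13 connector that Tomcat is listening on.
    """
    return "\n".join([
        f"worker.list={worker}",
        f"worker.{worker}.type=ajp13",
        f"worker.{worker}.host={host}",
        f"worker.{worker}.port={port}",
    ]) + "\n"


def render_jk_mounts(contexts, worker="worker1"):
    """Return one JkMount directive per webapp context path."""
    lines = []
    for context in contexts:
        path = "/" + context.strip("/")
        lines.append(f"JkMount {path}/* {worker}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_workers_properties())
    print(render_jk_mounts(["gridsphere", "CorePortlet", "team"]))
```

The JkMount lines go inside the default virtual host in httpd.conf. Any additional context that turns out to need Tomcat can be added to the list.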

Configure databases

  1. dump/restore geonportaldb and geoncat databases, createuser geonportaluser
  2. make sure pg_hba.conf is set to trust connections from localhost

Hardening the machines

  1. SSH: remove root ssh access, limit ssh access by IP/user for all nodes
  2. Apache: limit access to sysinfo apps by IP by adding new Directory directive(s). These include tripwire, phpSysInfo, phpMyAdmin, and ganglia.
  3. Data node does not need to have any web services running. Stop httpd and disable it at boot via chkconfig
  4. configure tripwire more... don't know much about it yet.

Deployment from SVN

  1. master ant build.xml at the top level should run ant deploy on all portlets in the portlets directory?

Software notes and issues

Java 1.5+ compliance

  1. CorePortlet has a dependency on Axis 1.2, which doesn't work with Java 1.5 because enum is a reserved keyword there. (Fixed by removing uses of enum as an identifier, changing references from org.apache.axis.enum to org.apache.axis.constants, and using the Axis 1.4 jar.)
  2. CorePortlet does not compile due to a missing dependency on IdentityAuthorization (fixed by adding ogsa.jar and cog-jglobus.jar to CorePortlet/lib)
  3. references to UUID are ambiguous because of java.util.UUID (new in Java 1.5). Several classes import package.name.* instead of importing each class explicitly. This is a general code smell in the codebase: with the organize-imports feature of modern IDEs (e.g., Eclipse), there is no reason for import foo.com.* to exist in a production codebase.
  4. putting shared jar dependencies in a shared lib folder - there are many redundant jars in the portlets, and even in the gridsphere lib.
  5. separate Gridsphere source tree/jars from the actual portlets
  6. current deployment process appears to be: deploy gridsphere, gridportlets, and then all the individual portlets. Should we write a script that goes through all portlets in the portlet directory and deploys them all? This is clumsy in ant; perhaps a perl/python/etc. script would be more suitable.
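
The deployment script suggested in the last item could look roughly like the following. It is a sketch under assumptions: it treats every subdirectory containing a build.xml as a portlet, assumes ant is on the PATH and that each portlet has a deploy target, and does not handle the gridsphere and gridportlets deployments that must come first. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def find_portlets(portlets_dir):
    """Return the sorted subdirectories of portlets_dir that contain a build.xml."""
    root = Path(portlets_dir)
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / "build.xml").is_file()
    )


def deploy_all(portlets_dir, target="deploy"):
    """Run `ant <target>` in each portlet directory.

    Keeps going after a failure and returns the names of the portlets
    whose ant run exited with a non-zero status.
    """
    failed = []
    for portlet in find_portlets(portlets_dir):
        print(f"==> ant {target} in {portlet.name}")
        result = subprocess.run(["ant", target], cwd=str(portlet))
        if result.returncode != 0:
            failed.append(portlet.name)
    return failed
```

Calling deploy_all("portlets") from the top of the checkout would then deploy each portlet in turn and report the ones that failed, instead of stopping at the first error.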

Improvements to the GEON codebase

  1. org.geongrid.sdsc.portlets.data.RegistrationPortlet is where all the URL mappings are placed for some reason. This should really be in an external config file somewhere. Furthermore, the mappings are reset/reinitialized on every request, as opposed to just once when the webapp starts up. See what alternative models, if any, are supported by Gridsphere.
  2. "GEON" is used all over the place. It would be nice if there were a central configuration file/registry that is used to provide the project name (geon, tdar, etc.), and everything just knows how to look it up. As it stands now we still need to fix a few places in the registration process where GEON is used instead of tdar.
  3. DataCatalog.java uses the catalog.database.connection to connect to the geon catalog and acts as a DAO for certain project-related metadata stored in the geoncat postgres database. We may want to replace this class entirely with our own.
  4. import fully qualified package names instead of .*
  5. org.geongrid.sdsc.portlets.data.RegistrationHelpInfo has a hardcoded reference to http://www.geongrid.org/portal/resource_reg/help_info.xml - probably is not something we want/need.
  6. why does webapp/js/jsp have the same files as webapp/js? (There are some slight modifications to xtree.js and popcalendar.js, and jsp/portal_search_results.js does not exist in the js folder.)
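
To get a sense of how widespread the wildcard imports are before cleaning them up, a small scan like the one below may help. It is only a sketch: the function name is hypothetical, and it only reports locations. The actual fix is still best done with an IDE's organize-imports feature.

```python
import re
from pathlib import Path

# Matches e.g. "import java.util.*;" but not "import static foo.Bar.*;"
WILDCARD_IMPORT = re.compile(r"^\s*import\s+([\w.]+)\.\*\s*;")


def find_wildcard_imports(source_root):
    """Return (path, line_number, package) for each wildcard import in .java files."""
    hits = []
    for path in sorted(Path(source_root).rglob("*.java")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            match = WILDCARD_IMPORT.match(line)
            if match:
                hits.append((path, number, match.group(1)))
    return hits
```

Filtering the result for the java.util package gives the classes most likely to hit the UUID ambiguity under Java 1.5.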

Making GEON compatible/installable with a clean version of tomcat

Tomcat 6

  1. change classpath to look in lib instead of common/lib (change from tomcat 5.X to tomcat 6)
  2. modify SportletContext and add String getContextPath() to delegate to context.getContextPath() (ServletContext gained this method in Servlet 2.5, which Tomcat 6 implements)

Bundled GEON portlets

attrauth

  1. used for authentication?

CorePortlet

  1. Cleaning up jar dependencies: Added a "shared.lib" property to the ant build.xml that points to a shared library folder with jars that are common to multiple portlets. Next up is cleaning up the jar dependencies so CorePortlet has just the appropriate jars in its lib folder.
  2. Cleaning up codebase:
    1. removed webapp/WEB-INF/classes/org - no reason source files should be in there, and generated classfiles should not be in source control.
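
One way to decide which jars belong in the shared library folder is to list the jars that appear in more than one portlet. The sketch below assumes that every top-level directory under the portlets directory is a portlet, and it compares jars by file name only, so the same library under differently named or versioned files will not be detected. The function name is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_jars(portlets_dir):
    """Map each jar file name to the sorted list of portlets that contain it.

    Only jars found under more than one top-level portlet directory are kept.
    """
    root = Path(portlets_dir)
    owners = defaultdict(set)
    for jar in root.rglob("*.jar"):
        # The first path component below portlets_dir is the portlet name.
        portlet = jar.relative_to(root).parts[0]
        owners[jar.name].add(portlet)
    return {name: sorted(ps) for name, ps in owners.items() if len(ps) > 1}
```

Jars reported here are candidates for the folder that the shared.lib property points to. Jars that appear in a single portlet stay in that portlet's lib folder.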

forumportlet

gama

  1. handles user account registration/authentication - need to modify the email that gets sent out on new account registration

GEONHyperlinkPortlets

GEONIframePortlets

LidarPortlet

PIAreaPortlet

portalstatusportlet

PortalUsePortlet

RssPortlet

SYNSEISPortlet

SystemPortlet

userprofilemanager

Questions

  1. Where should jarfiles live? I.e., if portlet A and portlet B have a dependency on log4j, where should log4j be placed?
  2. what is the purpose of gama's tomcat-non-secure and tomcat-secure tomcat containers?
  3. What's the best way to include team elements and any other 3rd-party libraries/dependencies?
  4. How to set up the database schema automatically?
  5. geon.properties exists in team/WEB-INF/classes and CorePortlet/WEB-INF/classes/geon.properties - is this necessary to include teamelements?
  6. is greceptor a necessary service?