Conclusion of Google Summer of Code 2016

This post details my contributions to the Apache Derby project during the Google Summer of Code 2016 program.

DERBY-6870 (Google Summer of Code 2016 : Derby bug fixing) is the Jira issue that describes the Google Summer of Code project. All details of the project can be found there.

Below is a brief summary of each Jira issue that I worked on during the summer.

  • DERBY-4555 — (Expand SYSCS_IMPORT_TABLE to accept CSV file with header lines.)

The purpose of this JIRA issue is to add a new feature to the SYSCS_IMPORT_TABLE and SYSCS_IMPORT_DATA system procedures so that they accept CSV files with header lines. As an additional feature, the SYSCS_IMPORT_DATA system procedure needs to accept column names (instead of column indexes) in its ‘COLUMNINDEXES’ argument.

These new features were added through the following four sub tasks. Along the way, I gained a thorough understanding of how these system procedures work in Derby.

  • DERBY-6892 — (Create new SYSCS_IMPORT_TABLE_BULK procedure)

This sub task adds a new system procedure named SYSCS_IMPORT_TABLE_BULK, which is a variant of the existing SYSCS_IMPORT_TABLE system procedure but has an additional argument at the end that specifies the number of initial lines of data in the input file to be skipped. Here I learned how to add a system procedure to Derby, and I became more familiar with the Java varargs feature and how useful it is. I also learned about full upgrade and soft upgrade in Derby from one version to another.
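As a rough illustration of how the new procedure might be called from JDBC, here is a minimal sketch. It assumes the first seven parameters keep the order of SYSCS_IMPORT_TABLE (schema, table, file name, column delimiter, character delimiter, codeset, replace flag) and that the trailing skip count is bound as a SMALLINT; the class and method names are made up for this example, so check the Reference Manual before relying on it.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper class, not part of Derby.
final class ImportTableBulkExample {
    // Eight parameters: the seven of SYSCS_IMPORT_TABLE plus the trailing skip count.
    static final String CALL_SQL =
        "CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE_BULK(?, ?, ?, ?, ?, ?, ?, ?)";

    /** Imports csvPath into schema.table, skipping the first skipLines lines of the file. */
    static void importSkippingHeader(Connection conn, String schema, String table,
                                     String csvPath, int skipLines) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(CALL_SQL)) {
            cs.setString(1, schema);
            cs.setString(2, table);
            cs.setString(3, csvPath);
            cs.setString(4, null);              // column delimiter: null asks for the default
            cs.setString(5, null);              // character delimiter: null asks for the default
            cs.setString(6, null);              // codeset: null asks for the default
            cs.setShort(7, (short) 0);          // 0 = add to existing rows instead of replacing
            cs.setShort(8, (short) skipLines);  // assumed SMALLINT: number of header lines to skip
            cs.execute();
        }
    }
}
```

With a one-line header, a call such as `importSkippingHeader(conn, "APP", "STAFF", "staff.csv", 1)` would be the expected usage; the schema, table and file names here are placeholders.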

Commits : https://svn.apache.org/r1751159

  • DERBY-6893 — (Create new SYSCS_IMPORT_DATA_BULK procedure)

This adds a new system procedure, named SYSCS_IMPORT_DATA_BULK, which is a variant of the existing SYSCS_IMPORT_DATA system procedure, but has an additional argument at the end that specifies the number of initial lines of data in the input file to be skipped. The initial lines are column header data at the start of the input file.
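The effect of the skip argument can be pictured with a small stand-alone sketch. This is not Derby's import code; it is a hypothetical helper that simply drops the first lines of a character stream and treats every remaining line as data, which ignores complications such as quoted fields that span several lines.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration, not part of Derby.
final class SkipHeaderLines {
    /** Returns the lines that remain after discarding the first skip lines. */
    static List<String> readDataLines(Reader source, int skip) {
        BufferedReader in = new BufferedReader(source);
        // lines() streams the input line by line; skip() drops the header lines.
        return in.lines().skip(skip).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String csv = "id,name\n1,Ann\n2,Bob\n";
        // Prints [1,Ann, 2,Bob]
        System.out.println(readDataLines(new StringReader(csv), 1));
    }
}
```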

Commits : https://svn.apache.org/r1751852

  • DERBY-6894 — (Enhance COLUMNINDEXES parsing for SYSCS_IMPORT_DATA_BULK to recognize columns by Name)

The purpose of this subtask is to allow the COLUMNINDEXES argument to specify columns in the input file by column header name, as an alternative to the column index number. Column header names can be specified as double-quoted strings, and you can mix and match indexes and names, so that COLUMNINDEXES could be specified as: 1,3,"LastName","FirstName",7. The behaviour is tested using JUnit, and I gained a thorough understanding of JUnit testing here.
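To make the idea concrete, here is a minimal sketch of how such a mixed specification could be resolved against a header row. It is only an illustration under simplifying assumptions (names contain no commas or quotes, and matching is exact and case-sensitive); the class and method names are invented and this is not the parsing code that was committed to Derby.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of Derby.
final class ColumnIndexesSketch {
    /**
     * Resolves a spec such as 1,3,"LastName" into 1-based column positions,
     * looking quoted names up in the header row.
     */
    static List<Integer> resolve(String spec, List<String> headerNames) {
        List<Integer> positions = new ArrayList<>();
        for (String raw : spec.split(",")) {
            String token = raw.trim();
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                String name = token.substring(1, token.length() - 1);
                int found = headerNames.indexOf(name);
                if (found < 0) {
                    throw new IllegalArgumentException("Unknown column name: " + name);
                }
                positions.add(found + 1); // convert the 0-based list index to a 1-based position
            } else {
                positions.add(Integer.parseInt(token)); // plain numeric index
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("Id", "FirstName", "LastName", "City");
        // Prints [1, 3, 2, 4]
        System.out.println(resolve("1,\"LastName\",\"FirstName\",4", header));
    }
}
```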

Commits : https://svn.apache.org/r1752990
 https://svn.apache.org/r1753876

  • DERBY-6895 — (Add documentation for new SYSCS_IMPORT_TABLE_BULK, SYSCS_IMPORT_DATA_BULK procedures)

This sub task adds two sections to the Reference Manual describing the new SYSCS_UTIL.SYSCS_IMPORT_DATA_BULK and SYSCS_UTIL.SYSCS_IMPORT_TABLE_BULK system procedures. This also makes small clarifying changes to the documentation of the existing SYSCS_UTIL.SYSCS_IMPORT_DATA and SYSCS_UTIL.SYSCS_IMPORT_DATA_LOBS_FROM_EXTFILE system procedures. Here I was exposed to the process of adding documentation to Derby and to the markup language DITA.

Commits : https://svn.apache.org/r1753624

  • DERBY-6852 — (Allow identity columns to cycle (as defined in SQL:2003))

This change introduces the new CYCLE keyword to the syntax of a generated identity column. The CYCLE keyword, if specified, causes Derby to select the CYCLE option in the sequence object that is created internally to implement the identity column. In turn, this means that values can still be generated for the identity column after the maximum value for the column’s datatype has been reached; at that point the sequence “cycles” and begins generating the values all over again. When a Derby sequence cycles, it now cycles to its minimum/maximum value (depending on whether it has a positive increment or a negative increment) rather than to its start value. Here I was exposed to SQL grammar and JavaCC, which were new to me, and gained a thorough understanding of identity columns in databases. This issue also introduced me to product release notes, since we decided to add a release note about the change in cycling behaviour.
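The wrap-around rule described above can be sketched with a small toy generator. This models an INTEGER-sized sequence in plain Java and is only meant to show the "cycle to the minimum or maximum, not to the start value" behaviour; the class is hypothetical and is not how Derby's sequence generator is implemented.

```java
// Hypothetical illustration, not part of Derby.
final class CyclingSequence {
    private final int min;
    private final int max;
    private final int increment;
    private int current;

    CyclingSequence(int start, int min, int max, int increment) {
        if (increment == 0 || min > max || start < min || start > max) {
            throw new IllegalArgumentException("invalid sequence definition");
        }
        this.min = min;
        this.max = max;
        this.increment = increment;
        this.current = start;
    }

    /** Returns the current value, then advances, wrapping once the range is exhausted. */
    int next() {
        int value = current;
        long candidate = (long) current + increment; // long arithmetic avoids int overflow
        if (candidate > max) {
            current = min;   // positive increment: wrap to the minimum, not the start value
        } else if (candidate < min) {
            current = max;   // negative increment: wrap to the maximum
        } else {
            current = (int) candidate;
        }
        return value;
    }

    public static void main(String[] args) {
        CyclingSequence seq = new CyclingSequence(3, 1, 5, 1);
        // Prints 3 4 5 1 2 on separate lines: after 5 the sequence restarts at 1, not at 3.
        for (int i = 0; i < 5; i++) {
            System.out.println(seq.next());
        }
    }
}
```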

Commits : https://svn.apache.org/r1756287
https://svn.apache.org/r1756297

  • DERBY-3181 — (is Nullable on ResultSetMetaData from DatabaseMetaData.getBestRowIdentifier values are opposite when there is no rows in Result Set vs. when there is a row.)

If an invalid scope argument was passed to the DatabaseMetaData.getBestRowIdentifier method, Derby was returning a hard-coded “empty” row identifier. Since JDBC does not require that we return such a row identifier for an invalid scope argument, we chose to throw an exception with a message indicating that an invalid scope argument was passed. Here I learned how to add error messages to Derby and became familiar with throwing exceptions.
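The kind of check involved can be sketched as follows. The three valid scopes are the standard constants on java.sql.DatabaseMetaData; the class name, method names and message text are invented for this example and are not Derby's actual code or error message.

```java
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

// Hypothetical illustration, not part of Derby.
final class ScopeCheck {
    /** True when scope is one of the three scopes defined by JDBC. */
    static boolean isValidScope(int scope) {
        return scope == DatabaseMetaData.bestRowTemporary
            || scope == DatabaseMetaData.bestRowTransaction
            || scope == DatabaseMetaData.bestRowSession;
    }

    /** Rejects an invalid scope instead of silently returning an empty result. */
    static void checkScope(int scope) throws SQLException {
        if (!isValidScope(scope)) {
            // Placeholder message; the real Derby message and SQLState differ.
            throw new SQLException("Invalid scope argument: " + scope);
        }
    }
}
```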

Commits : https://svn.apache.org/r1745414
https://svn.apache.org/r1746487

  • DERBY-853 — (ResultsetMetaData.getScale returns inconsistent values for DOUBLE type.)

When a SQL statement contains arithmetic expressions, the result of the expression may be of a different type than the operands to the expression, due to type precedence rules. In some of these cases, Derby was reporting that the result column had a non-zero scale, although the result column was not of DECIMAL type. This change modifies the NumericTypeCompiler so that it only computes a non-zero scale for the result column when it is of DECIMAL type.
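Reduced to its essence, the rule can be pictured like this. The sketch uses the JDBC type codes from java.sql.Types and a made-up helper method; it is not the NumericTypeCompiler code, which works on Derby's own type descriptors.

```java
import java.sql.Types;

// Hypothetical illustration, not part of Derby.
final class ResultScaleSketch {
    /** Reports the computed scale only for DECIMAL results; every other type gets 0. */
    static int reportedScale(int jdbcType, int computedScale) {
        return jdbcType == Types.DECIMAL ? computedScale : 0;
    }

    public static void main(String[] args) {
        System.out.println(reportedScale(Types.DECIMAL, 4)); // prints 4
        System.out.println(reportedScale(Types.DOUBLE, 4));  // prints 0
    }
}
```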

Commits : https://svn.apache.org/r1755133

  • DERBY-6550 — (Bulk-insert causes identity columns to cycle when they shouldn’t)

This bug couldn’t be reproduced, even though we checked several versions. We therefore added a new test case to the regression test suite to ensure that the problem does not creep back in. Here I became familiar with JUnit testing and gained a thorough understanding of testing.

Commits : https://svn.apache.org/r1747486

  • DERBY-5585 — (Improve error messages used when Derby can’t find the class or method backing up a SQL routine or type)

This improves the error message reported when Derby can’t find the class backing a user-defined function.

Commits : https://svn.apache.org/r1754348 
https://svn.apache.org/r1754588

  • DERBY-6391 — (remove unneeded object creation in newException() calls in releases >10.10)

After studying this issue with the community, we concluded that it was fully resolved by svn revision 1742858. This issue isn’t really a duplicate of DERBY-6856, but those fixes were all that was necessary with the deprecated constructors in Java 9.

  • DERBY-3600 — (Change replication methods in org.apache.derby.iapi.db.Database to throw StandardException instead of SQLException)

This issue is 8 1/2 years old. After considerable study, it was our opinion that the risks of changing the exception signatures at this point outweigh the potential benefits as we understand them. This issue made me more familiar with the various exception signatures in Derby.

  • DERBY-6752 — (AutoloadedDriver tries to load a non-existent class, AutoloadedDriver40)

This task removes the code that tries to load AutoloadedDriver40. That code was always failing, because the class was removed from the Derby code by revision 1494482, but the attempt to load the class was wrapped in an exception catch block, so there were no symptoms of the failed class load attempt.
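The pattern that hid the problem looks roughly like the sketch below. The class name being loaded and the helper method are placeholders, and the exact exception that Derby's code caught may differ; the point is only that a swallowed failure leaves no visible symptom.

```java
// Hypothetical illustration, not part of Derby.
final class SilentClassLoad {
    /** Tries to load a class by name and hides any failure by returning null. */
    static Class<?> loadQuietly(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            return null; // swallowed: the caller sees no symptom of the failed load
        }
    }

    public static void main(String[] args) {
        System.out.println(loadQuietly("java.lang.String") != null);     // prints true
        System.out.println(loadQuietly("org.example.NoSuchDriver40"));   // prints null
    }
}
```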

Commits : https://svn.apache.org/r1754680

This is the summary of my project in GSoC. I would like to thank the GSoC team and the Derby community for giving me a great experience throughout this summer. Special thanks to my mentor, Bryan Pendleton, for the immense help, encouragement and guidance during this development, and to Rick Hillegas for the support that made this project a success.
