Thursday, December 20, 2012

PDI: Pass Parameters to Jobs/Transformations

I had been trying to get a process to run once for each file. I used the Get File Names step followed by the Copy rows to result step, and placed them in front of my Text file input step, which is where you define the file for further processing.

That method produced a stream (that's what it's called in PDI) with each and every file and each and every record in those files. If I were just loading that into a table, it would have worked. However, I was assigning an identifier to each file using a database sequence. I needed a sequence for each file, but I wasn't getting it.
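To make the difference concrete, here is a minimal sketch in plain Python (not PDI) of what I was after, assuming "a sequence for each file" means one sequence value shared by all rows of a file. The file names, the rows, and the counter standing in for the database sequence are all made up for illustration.

```python
from itertools import count


def tag_rows_per_file(files, next_id):
    """Yield (file_id, row) pairs, drawing one id per file, not per row."""
    for _name, rows in files:
        file_id = next_id()  # one sequence value for the whole file
        for row in rows:
            yield file_id, row


# Hypothetical stand-in for a database sequence.
seq = count(1)

# Hypothetical files and their rows.
files = [("a.csv", ["r1", "r2"]), ("b.csv", ["r3"])]

tagged = list(tag_rows_per_file(files, lambda: next(seq)))
print(tagged)
```

Running the work once per file is what lets the identifier be drawn once per file; with everything merged into a single stream there is no natural point at which to do that.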

With some help and pointers from the ##pentaho IRC channel, I found this post (more on that one in the future): Run Kettle Job for each Row. I downloaded the sample provided to see how it worked.

The calc dates transformation just generates a lot of rows. Not much to see there. The magic, at least for me, was in the run for each row job entry.

Specifically, the Write to log step. (I have this need to see things since I don't understand everything about the tool yet, and Write to log gives me that ability.)

See date or, better, ${date}? That's how you reference parameters and variables.

I ran the job and watched the date scroll by. Nice. Then I tried to plug it into my job.

Zippo. Instead of seeing "this is my filename: /data/pentaho/blah/test.csv" in the log output, I just saw "this is my filename:". Ugh. I went back to the sample and plugged in my stuff. It worked. Yay. Went back to mine, it didn't. Gah! I tried changing the names, but then I'd just see "this is my filename: ${new_parameter_name}", so it wasn't resolving to the value.

Finally, after comparing the XML for the sample file and mine and finding no real differences, I just about gave up.

One last gasp, though: I went to the IRC channel and asked if there was some way to see the job or transformation settings. No one was home. I tried right-clicking to bring up the context menu, and there was Job Settings.

Job Settings brought up this one:

date is defined there. I checked mine. Nothing defined. I added filename to mine and ran it. Success!
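For what it's worth, here is a rough model of the ${name} behavior I was seeing, written in plain Python. It is only a sketch of the idea, not PDI's actual code: the substitute function and the dictionary of defined names are my own stand-ins, and the assumption is that a name nobody has defined is simply left in the text as-is.

```python
import re

# Matches ${name} placeholders.
_VAR = re.compile(r"\$\{([^}]+)\}")


def substitute(text, variables):
    """Replace ${name} with its value; leave undefined names untouched."""
    def repl(match):
        name = match.group(1)
        # Fall back to the original placeholder when the name is unknown.
        return variables.get(name, match.group(0))
    return _VAR.sub(repl, text)


message = "this is my filename: ${filename}"

# Nothing defined: the placeholder comes through unresolved.
print(substitute(message, {}))

# Once filename is defined, the value shows up.
print(substitute(message, {"filename": "/data/pentaho/blah/test.csv"}))
```

That matches the renamed-parameter case above, where the log showed the literal ${new_parameter_name}: the name was referenced but never defined anywhere the job could find it.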
