DistributionQueue
- Module to handle an AURORA distribution queue.
use DistributionQueue;
use Data::Dumper;
# instantiate
my $q=DistributionQueue->new();
# add a task
my $userid=3;
my $datasetid=104;
my $data="WhatEverWeWant";
my $task=$q->addTask($userid,$datasetid,$data);
# add task with optional tags
$task=$q->addTask($userid,$datasetid,$data,mytag=>"value",myothertag=>"value");
# show added task id
if (defined $task) {
print "New task ID: $task\n";
}
# enumerate tasks
my $tasks=$q->enumTasks();
if (defined $tasks) {
print Dumper($tasks);
}
# change task phase to DISTRIBUTING
$q->changeTaskPhase($task,"DISTRIBUTING");
# get a task tag value
my $value=$q->taskTag($task,"alive");
# set a task tag value
$q->taskTag($task,"alive",time());
# get task data
$data=$q->getTaskData($task);
# remove a task
$q->removeTask($task);
Module to handle an AURORA distribution queue. It makes it possible to create new tasks, move tasks between phases, read and write task tags, enumerate all tasks or only those matching userid, datasetid, status, retry and/or timeout criteria, remove tasks and get task data.
Each added task will create a folder and file hierarchy like this:
/distributions/
/distributions/taskid/
/distributions/taskid/DATA (the data to operate on)
/distributions/taskid/phase_VALUE (the phase that the distribution is in)
/distributions/taskid/status_VALUE (the status the task has)
/distributions/INITIALIZED/taskid -> ../taskid (phase folder with symlink to actual task folder)
/distributions/ACQUIRING/taskid -> ../taskid (phase folder)
/distributions/DELETING/taskid -> ../taskid (phase folder)
/distributions/DISTRIBUTING/taskid -> ../taskid (phase folder)
/distributions/FAILED/taskid -> ../taskid (phase folder)
The various tags set on a task will be read out when enumerating the tasks. They can also be read or updated by using the taskTag()-method.
A specific task exists in only one of the phase folders above at a time and is moved between them with the changeTaskPhase()-method.
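The layout above can be tried out in a temporary directory. The following is only a sketch of the idea, not the module's implementation: the helper move_phase_link() and the task id "example-task" are made up for illustration, and real task ids are generated by the module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper (not part of DistributionQueue): move a task's phase
# symlink from one phase folder to another. Returns 1 on success, 0 on failure.
sub move_phase_link {
    my ($root, $taskid, $from, $to) = @_;
    my $old = "$root/$from/$taskid";
    my $new = "$root/$to/$taskid";
    return 0 unless -l $old;            # the task must currently be in $from
    make_path("$root/$to");             # create the target phase folder if needed
    return rename($old, $new) ? 1 : 0;  # rename() moves the link itself
}

# Build a throwaway queue layout in a temporary directory.
my $root   = tempdir(CLEANUP => 1);
my $taskid = "example-task";            # placeholder task id

make_path("$root/$taskid", "$root/INITIALIZED");
open(my $fh, ">", "$root/$taskid/DATA") or die "Cannot write DATA: $!";
print $fh "WhatEverWeWant";
close($fh);

# Relative symlink as in the layout above: INITIALIZED/taskid -> ../taskid
symlink("../$taskid", "$root/INITIALIZED/$taskid") or die "Cannot symlink: $!";

my $moved = move_phase_link($root, $taskid, "INITIALIZED", "DISTRIBUTING");
print "moved: $moved\n";
print "DATA reachable via new phase folder\n" if -f "$root/DISTRIBUTING/$taskid/DATA";
```

Because the link is relative (../taskid), it keeps resolving to the task folder no matter which phase folder it sits in.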
Module constructor. Instantiates a DistributionQueue-class.
Possible options are:
folder Folder where the distribution queue is located. Defaults to /local/app/aurora/distributions.
delimiter Delimiter between the values in task identifier. Defaults to comma ",".
Returns an instantiated class.
Adds a task to the distribution queue.
Input parameters are in the following order:
userid Userid of the user that initiated this task. Required.
datasetid Dataset id of the dataset that this task operates on. Required.
data Data associated with the task. This is the data used to perform the task. Required.
<tags> A HASH with optional tag=>value(s) for the task. Will be added at the same time as the other information ensuring the tags are there when the task is moved into production.
It will add the task to the distribution folder and put it into the INITIALIZED status, awaiting the start of another phase (controlled by the user of the module).
Returns task id upon success, undef upon failure. Please see the error()-method for more information upon failure.
Removes a task from the distribution queue.
Input is the task id to remove from the distribution queue.
Returns 1 upon success, 0 upon failure. Please check the error()-method for more information upon failure.
The method will remove all traces of the task, including its tags, data and so on. It will even attempt to locate the phase-symlink if the phase tag does not exist and remove it.
Gets the operational data of the task.
Input is the task id to get the data on.
Return value is the data upon success, undef upon failure. Please check the error()-method for more information upon failure.
Gets or sets a task tag.
Input is the task id and the tag name and, when setting, the new tag value.
Returns the tag value on both get and set, undef upon failure. Please call the error()-method for more information upon a failure.
All tag names are lower case, and any tag name given in another case will be converted to lower case. The value can consist of any characters allowed in a POSIX file name.
Do not attempt to change the task tag "phase" yourself from this method. It is recommended to instead call the changeTaskPhase()-method that will handle it in the correct manner. If you do change it yourself, expect possible unwanted side-effects.
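The folder hierarchy shown earlier lists the tags as files named phase_VALUE and status_VALUE. Assuming that every tag follows this name_VALUE pattern and that the first underscore separates name from value (both are assumptions, not confirmed by this documentation), tag file names could be turned into a tag hash roughly like this. The helper tags_from_filenames() is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn task file names such as "phase_INITIALIZED"
# into a tag hash. Assumes the first underscore separates name and value.
sub tags_from_filenames {
    my @files = @_;
    my %tags;
    foreach my $file (@files) {
        next if $file eq "DATA";                  # the data file is not a tag
        my ($name, $value) = split(/_/, $file, 2);
        next unless defined $value;               # no underscore: not a tag file
        $tags{lc($name)} = $value;                # tag names are always lower case
    }
    return \%tags;
}

my $tags = tags_from_filenames("DATA", "phase_INITIALIZED",
                               "status_INITIALIZED", "Alive_1700000000");
foreach my $name (sort keys %$tags) {
    print "$name => $tags->{$name}\n";
}
```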
Adds a file and its contents to a task.
Input is in the following order: task, name and data.
Task is the id of the task, name is the name of the file to create in the task and data is the content of the newly created file. Any existing file is overwritten.
The method returns 1 upon success and 0 upon failure. Please check the error()-method for more information upon failure.
Changes a task's phase and status, moving the symlink and updating the phase and status tags.
Input is the task id and the new phase to set on it. Optionally, a status can be specified if it should differ from the phase; otherwise the status tag is set to the same value as the phase. The last parameter to the method is "samephase". If it is set to a false value, the method will not accept a change to the phase the task is already in. In doing so it acts as if the change were atomic, preventing other processes from making the same phase change at the same time and then continuing to run. The default behaviour is to accept a change to the same phase, so to use the option it must be set to a value that evaluates to false, e.g. 0.
Returns 1 upon success, 0 upon failure. Please call the error()-method for more information upon a failure.
The method moves the symlink file between the phase folders and updates the phase and status tags.
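The effect of "samephase" can be illustrated with a small in-memory stand-in. This is a sketch of the documented semantics only, not the module's code: change_task_phase() and the hashes are hypothetical, and passing undef to skip the optional status is an assumption.

```perl
use strict;
use warnings;

# In-memory stand-in for the documented semantics; NOT the module's code.
my %phase_of  = (task1 => "INITIALIZED");
my %status_of = (task1 => "INITIALIZED");

# Argument order follows the description: task, phase, optional status, samephase.
sub change_task_phase {
    my ($task, $phase, $status, $samephase) = @_;
    $samephase = 1 unless defined $samephase;    # default: same-phase change accepted
    return 0 unless exists $phase_of{$task};     # unknown task
    return 0 if !$samephase && $phase_of{$task} eq $phase;   # refuse same-phase change
    $phase_of{$task}  = $phase;
    $status_of{$task} = defined $status ? $status : $phase;  # status follows phase unless given
    return 1;
}

# Two workers try to claim the same task; only the first one succeeds.
my $first  = change_task_phase("task1", "DISTRIBUTING", undef, 0);
my $second = change_task_phase("task1", "DISTRIBUTING", undef, 0);
print "first claim: $first, second claim: $second\n";
```

A worker that gets 0 back can conclude that another process already moved the task into that phase and skip it.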
Enumerates all existing tasks, or only those matching the input parameters.
Possible input parameters are in the following order:
userid Only match tasks that have this userid. Can be undefined.
datasetid Only match tasks that have this datasetid. Can be undefined.
status Only match tasks that have this current status in the status tag. Can be undefined.
retry Maximum retry. Only match tasks that have been retried less than this many times. Can be undefined.
timeout Time before a task times out when not having updated its alive-tag. Only match tasks where the alive tag time + timeout-option is less than the current time. In other words, the alive tag has not been updated since the timeout period expired. The timeout value is specified in seconds. Can be undefined.
All specified parameters are and'ed together. Returned tasks must match all of the specified parameters.
The method will return a HASH-reference upon success, undef upon failure. Please call the error()-method for more information upon failure.
The resulting HASH-structure is like this:
(
   taskid => { userid => ID,
               datasetid => ID,
               random => RANDOMSTRING,
               tags => { phase => INITIALIZED, (or whatever other phase the user uses)
                         status => INITIALIZED, (or whatever other status the user uses)
                         tagX => VALUE,
                         tagY => VALUE,
                       },
             }
)
Not all of these parameters are necessarily present at once. The user of the class can add any tag(s) they want.
This method has a special handling of the alive- (moderated by the timeout-option), status- (moderated by the status-option) and retry- (moderated by the retry-option) tags.
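The filter rules described above can be written as a single predicate over one entry of the result structure. This is a sketch under assumptions rather than the module's logic: task_matches() is hypothetical, the "now" option exists only to make the example repeatable, and treating a missing retry or alive tag as 0 is a guess.

```perl
use strict;
use warnings;

# Hypothetical predicate mirroring the documented filters. $task has the
# shape of one entry in the result structure above.
sub task_matches {
    my ($task, %opt) = @_;
    my $tags = $task->{tags} || {};
    my $now  = defined $opt{now} ? $opt{now} : time();

    return 0 if defined $opt{userid}    && $task->{userid}    ne $opt{userid};
    return 0 if defined $opt{datasetid} && $task->{datasetid} ne $opt{datasetid};
    return 0 if defined $opt{status}    && ($tags->{status} // "") ne $opt{status};
    # retry: only tasks retried fewer than this many times (missing tag assumed 0)
    return 0 if defined $opt{retry}     && ($tags->{retry} // 0) >= $opt{retry};
    # timeout: alive time + timeout must be less than the current time
    return 0 if defined $opt{timeout}   && ($tags->{alive} // 0) + $opt{timeout} >= $now;
    return 1;                           # all specified filters matched
}

my %task = (userid => 3, datasetid => 104,
            tags => { phase => "DISTRIBUTING", status => "DISTRIBUTING",
                      retry => 1, alive => 1000 });

# alive (1000) + timeout (600) = 1600, which is less than now (2000): timed out
my $stale = task_matches(\%task, status => "DISTRIBUTING", retry => 3,
                         timeout => 600, now => 2000);
# 1600 is not less than now (1500): still alive, so the timeout filter rejects it
my $fresh = task_matches(\%task, timeout => 600, now => 1500);
print "stale: $stale, fresh: $fresh\n";
```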
Moves a task outside and beyond the distribution queue.
Input is the task id to rapture from the distribution queue.
The task's phase-link is removed and the task is then moved into the rapture folder "._rapture". There it is no longer visible to the DistributionQueue-module. This can be done to tasks that have, for example, failed for one reason or another.
Returns 1 upon success, 0 upon failure. Please check the error()-method for more information upon failure.
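The two steps described here (remove the phase link, then move the task folder) might look roughly like the following on disk. This is a sketch only: rapture_task() is a hypothetical helper, the task id is a placeholder, and placing "._rapture" inside the queue folder is an assumption.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper, not the module's code: remove the phase link and move
# the task folder into "._rapture". Returns 1 on success, 0 on failure.
sub rapture_task {
    my ($root, $taskid, $phase) = @_;
    my $link = "$root/$phase/$taskid";
    if (-l $link) {
        unlink($link) or return 0;      # drop the phase symlink first
    }
    make_path("$root/._rapture");       # assumed to live inside the queue folder
    return rename("$root/$taskid", "$root/._rapture/$taskid") ? 1 : 0;
}

# Throwaway layout with one failed task.
my $root   = tempdir(CLEANUP => 1);
my $taskid = "example-task";            # placeholder task id
make_path("$root/$taskid", "$root/FAILED");
symlink("../$taskid", "$root/FAILED/$taskid") or die "Cannot symlink: $!";

my $ok = rapture_task($root, $taskid, "FAILED");
print "raptured: $ok\n";
```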
Gets a task's files, including the DATA-file and tag-files (not . and ..).
Input is the task id.
Return value is LIST-reference with the filenames upon success, or undef upon failure. Please call the error()-method for more information upon failure.
Gets the task's unique and random ID.
Input is the usual task id.
Returns the random ID upon success, undef upon failure. Please call the error()-method for more information upon failure.
The task random ID is the last 32 characters in the task id. These 32 characters are random and should be unique.
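Since the random ID is described as the last 32 characters of the task id, it can be extracted with substr(). The helper random_id_of() and the example task id below are made up for illustration; in particular, the layout of the leading part of the id is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last 32 characters of a task id,
# or undef if the id is missing or too short.
sub random_id_of {
    my ($taskid) = @_;
    return undef unless defined $taskid && length($taskid) >= 32;
    return substr($taskid, -32);
}

# Made-up task id for illustration only.
my $random = "a" x 32;
my $taskid = "3,104," . $random;
print random_id_of($taskid), "\n";
```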
Gets the ID of the user that owns the task.
Input is the usual task id.
Returns the user ID upon success, undef upon failure. Please call the error()-method for more information upon failure.
Gets the dataset ID associated with the task.
Input is the usual task id.
Returns the dataset ID upon success, undef upon failure. Please call the error()-method for more information upon failure.
Sets or gets the folder where the distribution queue is.
On get, no input is accepted. On set, the input is the folder location to set.
Return value is always the folder-value.
Sets or gets the delimiter used for the task id.
On get, no input is accepted. On set, the input is the delimiter to set.
Return value is always the delimiter.
Gets the last error string.
No input is accepted.
Returns the last error of the module.