FileInterface

FileInterface (FI) manages the file storage for Aurora. It is intended for direct access through NFS, SMB/CIFS, HTTP etc. To simplify access control across different platforms, access control is implemented with Unix execute-only mode and hard-to-guess "cookie" directory names. Knowledge of a cookie gives access to the relevant object.

In FI, an Aurora entity may be a "dataset", a "subject", both or neither. The entity type is not consulted. A dataset is simply an entity FI has created a dataset storage for. A subject is an entity with the DATASET_CREATE or DATASET_READ right on a dataset.

The FI is split into two distinct parts:

Storage

This is where the actual data is stored. It may comprise several file systems, for scalability among other reasons. The different file systems should be available to the FI under a common base directory ($base), here called "/Aurora". Each file system has a name ($store) and contains two directories, "rw" and "ro". The store should be available to the FI as three entries under the base: the whole store as fi-$store, and its two directories as rw-$store and ro-$store.

This is typically implemented with NFS, as in the example below, but may also be symlinks or loopback mounts if local.

mount -o rw fileserver:/export/aurora_storage01    /Aurora/fi-storage01
mount -o rw fileserver:/export/aurora_storage01/rw /Aurora/rw-storage01
mount -o rw fileserver:/export/aurora_storage01/ro /Aurora/ro-storage01

The "fi-" mount must be exported as no_root_squash, since FI has to do privileged operations on it.

View

All access is implemented as symlinks under $base/view. This tree has two functions: keeping track of where each dataset is currently stored, and of who may access it.

Dataset structure

When a dataset is created, a store is selected based on hints to the create method, and a cookie is generated. The resulting storage path to the dataset is like $base/rw-$store/$scale/$entity/$cookie/. $scale is a scaling level that avoids an unlimited number of elements in one directory. The dataset consists of a data directory with the actual data, alongside other files and folders for metadata etc. The $scale part is derived from the $entity like '$scale = sprintf("%03d/%03d", int($entity/1000000) % 1000, int($entity/1000) % 1000)'. The data of the dataset should thus reside in a path like "/Aurora/fi-storage01/rw/000/034/34625/FtHugftvcRcrfdfdRD/data/". Unprivileged access should be done through the "rw" path "/Aurora/rw-storage01/000/034/34625/FtHugftvcRcrfdfdRD/data/".
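The path derivation above can be sketched as follows. The helper names scale_for() and dataset_path() are hypothetical and only illustrate the formula; they are not taken from the FileInterface source.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the two-level $scale prefix from an entity id,
# using the sprintf formula quoted in the text.
sub scale_for {
    my ($entity) = @_;
    return sprintf("%03d/%03d",
        int($entity / 1_000_000) % 1000,
        int($entity / 1_000) % 1000);
}

# Hypothetical helper: assemble $base/$mode-$store/$scale/$entity/$cookie.
sub dataset_path {
    my ($base, $mode, $store, $entity, $cookie) = @_;
    return join("/", $base, "$mode-$store", scale_for($entity), $entity, $cookie);
}

print scale_for(34625), "\n";
print dataset_path("/Aurora", "rw", "storage01", 34625, "FtHugftvcRcrfdfdRD"),
    "/data/\n";
```

With entity 34625 this gives the scale "000/034", matching the example paths in the text.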

On close, the dataset is moved from the "rw" to the "ro" storage directory, and unprivileged access is thus done through the read-only ro-$store mount.

To keep track of the exact location at any time, FI maintains symbolic links under "$base/view". This tree is similarly scaled, so "$base/view/$scale/$entity" will always be a relative symlink to "$base/$mode-$store/$scale/$entity", like

/Aurora/view/000/034/34625 -> ../../../rw-storage01/000/034/34625
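One way to compute such a relative link target is with the core File::Spec module. This is only a sketch: relative_target() is a hypothetical helper, and FI may compute its links differently.

```perl
use strict;
use warnings;
use File::Spec;
use File::Basename qw(dirname);

# Hypothetical helper: the relative target for a symlink located at $link
# that should point to $target. The target is expressed relative to the
# directory containing the link, which is how symlinks are resolved.
sub relative_target {
    my ($link, $target) = @_;
    return File::Spec->abs2rel($target, dirname($link));
}

my $link   = "/Aurora/view/000/034/34625";
my $target = "/Aurora/rw-storage01/000/034/34625";
print "$link -> ", relative_target($link, $target), "\n";
```

Because the links are relative, the view tree keeps working when $base is mounted under a different path on a client.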

Access structure

Any entity with DATASET_CREATE or DATASET_READ is considered a "subject". The subject is assigned a $keycode and represented by a directory in the view tree as $base/view/$scale/$entity-$keycode. This contains relative symlinks to the dataset folders of all datasets the subject is entitled to. So if entity 450 has keycode lkjKLjLJoihjlIj and access to 34625, there is a symlink like this:

/Aurora/view/000/000/450-lkjKLjLJoihjlIj/34625 -> ../../034/34625/FtHugftvcRcrfdfdRD

So knowledge of 450's keycode gives access to dataset 34625.
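The link chain can be tried out in a scratch directory. The sketch below uses a temporary directory as a stand-in for /Aurora and reuses the ids, cookie and keycode from the examples above. It only builds the symlinks and does not set the directory permissions that FI relies on for access control.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use Cwd qw(realpath);

my $base = tempdir(CLEANUP => 1);    # stand-in for /Aurora
my ($cookie, $keycode) = ("FtHugftvcRcrfdfdRD", "lkjKLjLJoihjlIj");

# Dataset 34625 lives on the rw side of store "storage01".
make_path("$base/rw-storage01/000/034/34625/$cookie/data");

# View link that tracks where the dataset is currently stored.
make_path("$base/view/000/034");
symlink("../../../rw-storage01/000/034/34625", "$base/view/000/034/34625")
    or die "symlink failed: $!";

# Subject 450 gets a keycode directory holding one grant link.
my $subject = "$base/view/000/000/450-$keycode";
make_path($subject);
symlink("../../034/34625/$cookie", "$subject/34625")
    or die "symlink failed: $!";

# Following the grant link ends up in the dataset's data directory.
my $via_grant = realpath("$subject/34625/data");
my $direct    = realpath("$base/rw-storage01/000/034/34625/$cookie/data");
print $via_grant eq $direct ? "grant resolves to dataset\n" : "mismatch\n";
```

Note that the grant link goes through the view link rather than straight to the store, so moving the dataset from rw to ro only requires the view link to be updated.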

Local user access

If a subject maps to a local user ($username), a directory with exclusive access for the user is created as "$base/view/user/$username". In this there will be a relative symlink "ALL" pointing to the subject's access directory, like

/Aurora/view/user/bt/ALL -> ../../000/000/450-lkjKLjLJoihjlIj

The local user "bt" (entity 450) may in this way access the dataset as /Aurora/view/user/bt/ALL/34625/ without knowledge of its keycode or the cookie of the dataset.

FI roles

Aurora server

This is the controlling service for dataset management. It normally runs without privileges, so FI is accessed through a simple server/client interface. This is based on sudo for escalation, but may easily be adapted to ssh if running on a separate host.

Aurora client

An Aurora client is a host that provides user access to the data, like login services, an HTTP or Samba server etc. The clients need to mount the view directory tree read-only and any "$mode-" exports with the relevant mode, all preferably with root squash.

Note that FI is not responsible for data transport. Populating the dataset's data directory is done through an Aurora server or client.

FI purger

This is an asynchronous privileged process maintaining the access structure through the purgepoll() and purge() methods.

FI sets (unimplemented)

This is a process that populates the user's access directory with sets of relative symlinks to datasets in ALL/, based on the user's wishes.

Synopsis

Client usage

use FileInterface;
$client = FileInterfaceClient->new;
$datapath = $client->create($entity, $user, $parent);
$datapath = $client->close($id);
$datapath = $client->datapath($id);
$mode = $client->mode;
$result = $client->lint($id);
$client->remove($id);

Maintenance processes (privileged)

use FileInterface;
my $fi = FileInterface->new;
$elapsed = $fi->purgepoll;
$elapsed = $fi->purge;
$result = $fi->lint($id);
$results = $fi->storelint;

Methods

yell([string, ...])

Most methods return undefined on errors. Any error messages are returned as a list from the yell() method. Any parameters to yell() are added to the list, except for the first parameter, which may have a special meaning. The most important is "<", which clears the list before adding any subsequent parameters. Other codes can be found in the source. yell() prepends messages with the name of the calling method.
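The error convention can be illustrated with a minimal stand-alone sketch. The class YellSketch and its create() method are hypothetical, only the "<" code is handled, and the message format is an assumption. The real yell() supports further codes and may differ in detail.

```perl
use strict;
use warnings;

package YellSketch;    # hypothetical class, not the real FileInterface

sub new { return bless { yell => [] }, shift }

sub yell {
    my ($self, @msg) = @_;
    # A leading "<" clears the list before the remaining messages are added.
    if (@msg and $msg[0] eq "<") {
        shift @msg;
        $self->{yell} = [];
    }
    # Name of the method that called yell(), without the package prefix.
    my $who = (caller(1))[3] // "main";
    $who =~ s/^.*:://;
    push @{ $self->{yell} }, map { "$who: $_" } @msg;
    return @{ $self->{yell} };
}

sub create {
    my ($self) = @_;
    $self->yell("<", "dataset already exists");
    return;    # undefined signals the error to the caller
}

package main;

my $fi = YellSketch->new;
unless (defined $fi->create) {
    print "$_\n" for $fi->yell;    # calling yell() without parameters reads the list
}
```

A caller thus checks the return value for undef first and only then fetches the messages.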

new(AuroraDB, base, http)

Return a new FileInterface object. The three parameters are optional and primarily for testing. In production, any undefined parameter will default to a reasonable value.

server()

Read method and parameters from STDIN and return the result and yells on STDOUT as YAML. Primarily used to execute the following dataset methods with privileges.

All are pure wrappers for the similar FileInterfaceDataset methods described below, except for create, close and datapath, which return absolute paths (prepended with "$base/").

FI()

Return the FileInterface object itself, also for child FileInterfaceDataset objects. The following methods use FI() where relevant, allowing inheritance to child objects for methods concerning the parent FI object.

Settings()

Return the Aurora Settings object used for configuration.

adb()

Return the AuroraDB object.

dbi()

Return the DBI object of the AuroraDB

base()

Return the configured base path.

absolute(path, [path, ...])

Join paths with "/" and prepend with $base.

flush()

Clear all data and objects cached in the FI object.

dataset(id)

Return a FileInterfaceDataset object for the entity id. The object is new unless found in the cache.

ensurepath(path, mode)

Make sure the path exists. Set mode on any newly created directory.
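A stand-alone function with the same contract might look like the sketch below. It is not the FI method itself, and the mode 0711 used in the example is an assumption about what "execute only" means for group and others.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Sketch of the contract: create missing directories and set $mode only on
# the ones created by this call. Existing directories are left untouched.
sub ensurepath {
    my ($path, $mode) = @_;
    my @created = make_path($path);       # returns the directories it created
    chmod($mode, @created) if @created;   # explicit chmod, so umask does not interfere
    return -d $path ? $path : undef;
}

my $root = tempdir(CLEANUP => 1);
my $dir  = ensurepath("$root/view/000/034", 0711);
printf "%s has mode %04o\n", $dir, (stat $dir)[2] & 07777;
```

The explicit chmod() is used because the mode option of make_path() is filtered through the process umask.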

newcookie()

Return a newly created cookie string for a dataset cookie or subject keycode.

Create a relative link to target using the shortest relative path.

selectstore([hint, ...])

Find the preferred store according to the hints. The hints are entity ids, and these will be followed along their entity path for the first hit. The hints are typically the user requesting the creation and the parent of the dataset to be created. A hit is when a store with the entity number or "system.fileinterface.store" metadata as the store name is found.

storewcheck(store)

Check that a store is online and writable.

Takes a store name as parameter. Return 1 if online and rw/ is writable.

May find new (empty) stores that storeprobe() does not know about.

storeprobe()

Bring all known stores online.

Return all known stores as a hash. The key in the hash is "$mode-$store", the value is the stat() of the directory.

storescan([store, ...])

Scan the list of stores for datasets, and return a complete list. Store is here the $mode-$store mount.

If no list is given, all rw- and ro- trees present, as well as view, are scanned. storeprobe() is called prior to the scan to bring all known stores online. Note that any unknown stores (i.e. with only unknown datasets) have to be brought online in some other way to detect the unknown datasets, unless a "browse" automount option is in effect.

Return a hash with dataset id as key and "$mode-$store" as value.

storelint([store, ...])

Run lint() on all datasets found by storescan([store, ...]). Any parameters are passed unaltered to storescan().

Return a hash with dataset id as key. The values are the return hash from lint() on the dataset, possibly augmented by storelint().

grantpathview(subject, keycode)

Return the subject view path according to the parameters.

grantpath(subject, keycode, dataset)

Return the path for a grant. Unless already existing, the grantpathview() is created and the subject is registered.

purge()

Evaluate the actual permissions against registered grants, and do the necessary adjustments. This is split into three phases:

Return the elapsed time.

purgepoll()

Run purge() if there are any changes in the source.

Return the elapsed time from purge() or 0 if purge() is not run.

mapuserstring(userstring [,helper])

Maps a userstring to a local user for purge_user(). The optional helper parameter is the name of an external program that is passed the userstring, and is expected to return a normal passwd string, which is split and returned. Without a helper, it expects the userstring to be of the form /^(\w+)\@ntnu.no$/, and returns getpwnam($1).
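The helper-less case can be sketched as below. The function names are hypothetical, and the sketch escapes the dot in the domain, which makes the pattern slightly stricter than the one quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the local user name from a userstring such as
# "bt@ntnu.no". Returns undef when the string does not match.
sub local_name {
    my ($userstring) = @_;
    return $userstring =~ /^(\w+)\@ntnu\.no$/ ? $1 : undef;
}

# Sketch of the mapping without an external helper program.
sub map_userstring {
    my ($userstring) = @_;
    my $name = local_name($userstring);
    return unless defined $name;
    # In list context getpwnam() returns the passwd fields, or an empty
    # list when there is no such local user.
    return getpwnam($name);
}

for my $userstring ('bt@ntnu.no', 'bt@example.org') {
    my $name = local_name($userstring);
    printf "%s => %s\n", $userstring, defined $name ? $name : "(no match)";
}
```

Whether map_userstring() returns anything for a matching string depends on the local passwd database of the host it runs on.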

FileInterfaceDataset;

Dataset methods are separated into a subclass. A FileInterfaceDataset object always knows its own entity id and the FileInterface object it was created from.

new(FI, info)

Create a new dataset object from an info hash, typically from an SQL query.

Returns the newly created object.

get(FI, id)

Create and return a new dataset object.

Save or update a dataset database entry. Only defined parameters are updated.

id()

Return the id of a dataset.

info()

Return the info hash of the dataset. The fields are

The values are undefined for missing/unset fields, like fiprivate if no cookie is set.

All of these may also be obtained as methods with the same name, like entity() etc.

check()

Return $self->datapath if defined and the path exists.

find

Return $self->datapath if check() or a meaningful view link is found.

mode2perm(mode)

Return the numeric permission associated with the mode.

create()

Create a dataset for an entity.

Return datapath if created and online on exit, but yell if it already existed.

close()

Move a dataset from rw to ro mode and return the new dataset path. Creates a new cookie in the process.

remove()

Remove a dataset and return the id. Currently just renames it out of view.

recook() (unimplemented)

Set a new cookie and update all relevant links.

lint()

Clean up any discrepancies. Also does some auxiliary tasks like maintaining HTML shortcuts etc.

Return a hash of the following lists:

FileInterfaceClient;

new([connect, ...])

Return a FileInterfaceClient object for privilege escalation. The optional connect parameters are passed to open2 if a connection escalation method other than the default "sudo" is required. Should be transparent to FileInterface for the following methods:

This method is made a no-op in the client, since its function should now be handled by an asynchronous process:

The following methods are internal to the client: