Commit 5f60ebd3 authored by Fabien Viale, committed by GitHub

Merge pull request #789 from fviale/master

RunAsUser documentation
parents eb650a2f 72b18e5e
......@@ -1154,6 +1154,7 @@ Execute the workflow by setting the different workflow's variables as described
| String (default="user").
| `MODEL_SERVICE_NODE_NAME`
| The name of the node where the service will be deployed. If empty, the service will be deployed on an available node selected randomly.
| String
3+^|*Task variables*
| `SERVICE_ID`
| The name of the service. Please keep the default value for this variable.
......@@ -1634,8 +1635,7 @@ image::AutoFeat_column_summaries.png[align=center]
=== Edit column names and types
A preview of the data is displayed in the *Edit Column Names and Types* as follows.
[[_Edit_column_names_and_types]]
image::AutoFeat_edit_column_names_and_types.png[align=center]
image::AutoFeat_edit_column_names_and_types.png["Edit column names and types",align=center]
It is possible to change a column information. These changes can include:
......@@ -1649,8 +1649,7 @@ It is possible to change a column information. These changes can include:
- _Coding Method_: The encoding method used for converting the categorical data values into numerical values. The value is set to *Auto* by default. Thereafter, the best suited method for encoding the categorical feature is automatically identified. The data scientist still has the ability to override every decision and select another encoding method from the drop-down menu. Different methods are supported by AutoFeat such as *Label*, *OneHot*, *Dummy*, *Binary*, *Base N*, *Hash* and *Target*. Some of those methods require specifying additional encoding parameters. These parameters vary depending on the selected method (e.g., the base and the number of components for BaseN and Hash, respectively, and the target column for Target encoding method). Some of those values are set by default, if no values are specified by the user.
[[_Edit_column_names_and_types]]
image::AutoFeat_edit_column_names_and_types_encoding_parameters.png[align=center]
image::AutoFeat_edit_column_names_and_types_encoding_parameters.png["Edit column names and types",align=center]
It is also possible to perform the following actions on the dataset:
......@@ -4274,7 +4273,7 @@ NOTE: Torchtext were used to preprocess and load the text input. More informatio
| Boolean (default=True)
|===
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/docs/stable/torchvision/models.html[AlexNet^].
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/vision/stable/models.html[AlexNet^].
===== DenseNet-161
......@@ -4282,7 +4281,7 @@ NOTE: PyTorch is used to build the model architecture based on https://pytorch.o
*Usage:* It should be connected to <<Train_Image_Classification_Model>>.
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/docs/stable/torchvision/models.html[DenseNet-161^].
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/vision/stable/models.html[DenseNet-161^].
.DenseNet-161_Task variables
[cols="2,5,2"]
......@@ -4302,7 +4301,7 @@ NOTE: PyTorch is used to build the model architecture based on https://pytorch.o
*Usage:* It should be connected to <<Train_Image_Classification_Model>>.
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/docs/stable/torchvision/models.html[ResNet-18^].
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/vision/stable/models.html[ResNet-18^].
.ResNet-18_Task variables
[cols="2,5,2"]
......@@ -4322,7 +4321,7 @@ NOTE: PyTorch is used to build the model architecture based on https://pytorch.o
*Usage:* It should be connected to <<Train_Image_Classification_Model>>.
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/docs/stable/torchvision/models.html[VGG-16^].
NOTE: PyTorch is used to build the model architecture based on https://pytorch.org/vision/stable/models.html[VGG-16^].
.VGG-16_Task variables
[cols="2,5,2"]
......
......@@ -748,7 +748,7 @@ The service is started using the following variables.
| Boolean
| `false`
| `PYTHON_ENTRYPOINT`
| This entry script starts the service and defines the different functions to deploy the model, scores the prediction requests based on the deployed model, and returns the results. This script is specific to your model. This file should be stored in the Catalog under the `model_as_service_resources` bucket. More information about this file can be found in the <<../PML/PMLUserGuide.html#_customize_the_service>> section.
| This entry script starts the service and defines the different functions to deploy the model, scores the prediction requests based on the deployed model, and returns the results. This script is specific to your model. This file should be stored in the Catalog under the `model_as_service_resources` bucket. More information about this file can be found in the link:../PML/PMLUserGuide.html#_customize_the_service[Customize the Service] section.
| Yes
| String
| `ml_service`
......@@ -768,7 +768,7 @@ The service is started using the following variables.
| Boolean
| `true`
| `YAML_FILE`
| A YAML file that describes the OpenAPI Specification ver. 2 (known as Swagger Spec) of the service. This file should be stored in the catalog under the `model_as_service_resources` bucket. More information about the structure of this file can be found in the section <<<../PML/PMLUserGuide.html#_customize_the_service>>.
| A YAML file that describes the OpenAPI Specification ver. 2 (known as Swagger Spec) of the service. This file should be stored in the catalog under the `model_as_service_resources` bucket. More information about the structure of this file can be found in the section link:../PML/PMLUserGuide.html#_customize_the_service[Customize the Service].
| Yes
| String
| `ml_service-api`
......@@ -864,7 +864,7 @@ The service is started using the following variables.
| Boolean
| `false`
| `PYTHON_ENTRYPOINT`
| This entry script starts the service and defines the different functions to deploy the model, scores the prediction requests based on the deployed model, and returns the results. This script is specific to your model. This file should be stored in the Catalog under the `model_as_service_resources` bucket. More information about this file can be found in the <<_customize_the_service>> section.
| This entry script starts the service and defines the different functions to deploy the model, scores the prediction requests based on the deployed model, and returns the results. This script is specific to your model. This file should be stored in the Catalog under the `model_as_service_resources` bucket. More information about this file can be found in the link:../PML/PMLUserGuide.html#_customize_the_service[Customize the Service] section.
| Yes
| String
| `dl_service`
......@@ -884,7 +884,7 @@ The service is started using the following variables.
| Boolean
| `true`
| `YAML_FILE`
| A YAML file that describes the OpenAPI Specification ver. 2 (known as Swagger Spec) of the service. This file should be stored in the catalog under the `model_as_service_resources` bucket. More information about the structure of this file can be found in the section <<_customize_the_service>>.
| A YAML file that describes the OpenAPI Specification ver. 2 (known as Swagger Spec) of the service. This file should be stored in the catalog under the `model_as_service_resources` bucket. More information about the structure of this file can be found in the section link:../PML/PMLUserGuide.html#_customize_the_service[Customize the Service].
| Yes
| String
| `dl_service-api`
......
......@@ -2618,17 +2618,43 @@ Also, you can find *Needed Nodes* in the Scheduler portal where the Scheduler st
image::neededNodes-scheduler.png[align="center"]
[[_run_as_me]]
[[_run_as_me_admin]]
== Run Computation with a user's system account
Configure a ProActive Node to execute tasks under a user's system account by ticking the
link:../user/ProActiveUserGuide.html#_run_computation_with_your_system_account[Run as me] box in the task configuration.
By default authentication is done through a password, but can also be configured using a SSH key.
ProActive workflow Tasks can be configured to run under a user's system account by ticking the
link:../user/ProActiveUserGuide.html#_run_as_me[Run as me] checkbox in the task configuration.
image::../images/StudioRunAsMe.png["RunAsMe in Studio",width=300]
The RunAsMe mode can also be configured globally on the ProActive server. In that case, all workflow tasks from all users will be executed in RunAsMe mode.
Global RunAsMe can be configured in file `PROACTIVE_HOME/config/scheduler/settings.ini`, by enabling the following property:
[source, properties]
----
# If true tasks are always ran in RunAsMe mode (impersonation). This automatically implies pa.scheduler.task.fork=true (other setting is ignored)
pa.scheduler.task.runasme=true
----
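As a quick sanity check of the setting above, the following sketch tests whether a settings file enables global RunAsMe. The function name `runasme_enabled` and the temporary demo file are illustrative assumptions, not part of ProActive; on a real server the function would be pointed at `PROACTIVE_HOME/config/scheduler/settings.ini`. It only recognizes the simple `key=value` form on an uncommented line.

[source, bash]
----
#!/usr/bin/env bash

# Hypothetical helper: returns 0 when the given settings file contains an
# uncommented "pa.scheduler.task.runasme=true" line, non-zero otherwise.
runasme_enabled() {
  local settings_file="$1"
  grep -Eq '^[[:space:]]*pa\.scheduler\.task\.runasme[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$settings_file"
}

# Demonstration on a throwaway file instead of a real settings.ini.
demo_file="$(mktemp)"
printf '%s\n' '# demo settings' 'pa.scheduler.task.runasme=true' > "$demo_file"

if runasme_enabled "$demo_file"; then
  runasme_status="enabled"
else
  runasme_status="disabled"
fi
echo "Global RunAsMe is $runasme_status"

rm -f "$demo_file"
----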
By default, authentication is done through a <<_using_password,password>>, but can also be configured to use an <<_using_ssh_keys,SSH key>> or <<_using_password_less_sudo,password-less sudo>>.
The impersonation mode is configurable for each ProActive Node, using the java property `pas.launcher.forkas.method`.
For example:
[source, bash]
----
./bin/proactive-node -Dpas.launcher.forkas.method=pwd # start a node using password impersonation
./bin/proactive-node -Dpas.launcher.forkas.method=key # start a node using SSH key impersonation
./bin/proactive-node -Dpas.launcher.forkas.method=none # start a node using password-less sudo impersonation
----
This property can also be specified in ResourceManager <<_node_source_infrastructures>> through parameters that allow adding extra arguments to the _java virtual machine_.
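When node start-up is scripted, it can help to validate the method name before it reaches the JVM. The sketch below is a hypothetical wrapper (the function name `forkas_arg` is an assumption, not a ProActive tool) that accepts only the three values listed above and prints the corresponding java property argument.

[source, bash]
----
#!/usr/bin/env bash

# Hypothetical helper: prints the JVM argument for a supported impersonation
# method (pwd, key or none) and fails with an error message for anything else.
forkas_arg() {
  case "$1" in
    pwd|key|none)
      printf -- '-Dpas.launcher.forkas.method=%s\n' "$1"
      ;;
    *)
      printf 'Unsupported impersonation method: %s (expected pwd, key or none)\n' "$1" >&2
      return 1
      ;;
  esac
}

# Build the command line for a node using SSH key impersonation.
node_arg="$(forkas_arg key)"
echo "./bin/proactive-node $node_arg"
----

The same argument string can be passed through the extra JVM argument parameters of a node source infrastructure.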
Find a step by step tutorial link:AdminTutorials.html[here].
For proper execution, *the user's system account must have:*
=== System Configuration
For proper execution, *the user's system account must have*:
* Execution rights to the PROACTIVE_HOME directory and all its parent directories.
* Write access to the PROACTIVE_HOME directory.
......@@ -2652,46 +2678,71 @@ Example:
proactive-node -Djava.io.tmpdir=C:\TEMP
----
Extra configurations are also needed depending on the authentication method below.
=== Using password
The ProActive Node will try to impersonate the user that submitted the task when running it. It means the
username and password must be the same between the Scheduler and the operating system.
username and password must be the same between the ProActive Scheduler and the operating system.
=== Using SSH keys
To enable this authentication method, set the java system property `-Dpas.launcher.forkas.method=key` when starting a ProActive Node.
Example:
----
proactive-node -Dpas.launcher.forkas.method=key
----
An SSH key can be tied to the user's account and used to impersonate the user when running a task on a given machine (using SSH).
The .ssh/authorized_keys files of all machines must be configured to accept this SSH key.
The SSH key must require *no passphrase*.
Additionally:
- The `.ssh/authorized_keys` files of all machines must be configured to accept this SSH key.
- The SSH key must not be protected by a *passphrase*.
- Using this method on Windows machines is not recommended, as it requires the installation and configuration of an SSH server.
When logging in to the Scheduler portal, the user must provide their private key. This can be done by selecting `More options > Use SSH private key` in the login dialog.
To enable this method, set the system property `pas.launcher.forkas.method` to _key_ when starting a ProActive Node.
Alternatively, users can add their SSH private key to link:../user/ProActiveUserGuide.adoc#_third_party_credentials[Third Party Credentials] under the name `SSH_PRIVATE_KEY`.
This method allows users to log in normally to the ProActive portals, without the need to enter their private key each time.
Finally, users can also authenticate to the portals with credential files.
The user must first create a credential file containing the user login, password and SSH key.
Run the following command on the ProActive server to create a credential file:
Example:
----
proactive-node -Dpas.launcher.forkas.method=key
$ <PROACTIVE_HOME>/tools/proactive-create-cred -F <PROACTIVE_HOME>/config/authentication/keys/pub.key -l username -p userpwd -k path/to/private/sshkey -o myCredentials.cred
----
=== Using passwordless sudo
This command will create an encrypted file with *username* as login, *userpwd* as password, using Scheduler public key at
`config/authentication/keys/pub.key`
for credentials encryption and using the user private SSH key at
`path/to/private/sshkey`. The new credential will be stored in *myCredentials.cred*.
Once created, the user must connect to the ProActive portals using this credential file. For example, in the studio portal below:
This configuration, only availabe on linux nodes, allows the impersonation to be performed using passwordless sudo.
image::../images/StudioCredentialsLogin.png["Studio login with Credentials", ,width=300]
To enable it at the system level, edit the /etc/sudoers file to allow passwordless sudo from the account running the ProActive node
to any users which require impersonation. Passwordless sudo should be enabled for any command.
=== Using password-less sudo
This configuration, only available on Linux or Unix Nodes, allows the impersonation to be performed using password-less sudo.
To enable it at the system level, edit the `/etc/sudoers` file to allow password-less sudo from the account running the ProActive Node
to any user that requires impersonation. Password-less sudo should be enabled for any command.
For example, the following line allows the proactive account to impersonate any user:
----
proactive ALL=(ALL) NOPASSWD: ALL
----
To enable this configuration on the ProActive node, start it with the system property `pas.launcher.forkas.method` to _none_
To enable this configuration on the ProActive node, start it with the system property `-Dpas.launcher.forkas.method=none`
Example:
----
proactive-node -Dpas.launcher.forkas.method=none
----
[[_web_applications]]
== Configure Web applications
......
......@@ -158,6 +158,9 @@ pa.scheduler.startscripts.paths=tools/LoadPackages.groovy
# Size of parsed workflow cache, used to optimize workflow submission time
pa.scheduler.stax.job.cache=5000
# Size of the cache used to ensure that delayed jobs or tasks are scheduled at the precise date (without skipping seconds)
pa.scheduler.startat.cache=5000
#-------------------------------------------------------
#---------------- JOBS PROPERTIES ------------------
#-------------------------------------------------------
......
......@@ -122,7 +122,7 @@ ProActive Workflows supports tasks in many scripting languages. The currently su
link:http://groovy-lang.org/[Groovy, window="_blank"],
link:https://www.jython.org[Jython, window="_blank"],
link:https://www.python.org/[Python, window="_blank"],
link:https://jruby.org[JRuby, window="_blank"],
link:https://www.jruby.org[JRuby, window="_blank"],
link:https://docs.oracle.com/javase/8/docs/technotes/guides/scripting/index.html#jsengine[Javascript, window="_blank"],
link:https://www.scala-lang.org/[Scala, window="_blank"],
link:https://docs.microsoft.com/fr-fr/powershell/scripting/overview?view=powershell-5.0[Powershell, window="_blank"],
......@@ -1482,38 +1482,106 @@ To do so, a Generic Information is required for the desired task, with the key
`PRE_SCRIPT_AS_FILE`, and the path of the file that you want to save your pre-script as its value.
The path should contain a file name. If it is an absolute path, then the file will be stored in this absolute path.
If is is a relative path, then the file will be stored in the link:../user/ProActiveUserGuide.html#_local_space[Local Space].
If it is a relative path, then the file will be stored in the link:../user/ProActiveUserGuide.html#_local_space[Local Space].
If you don't give a specific extension in the path, the extension will be automatically assigned to the corresponding one of the language selected for this pre-script.
[[_run_as_me]]
=== Run Computation with a user's system account
=== Run computation with your system account
When workflow tasks are executed inside a <<_glossary_proactive_node,ProActive Node>>, they run by default under the system account used to start the Node.
It is possible to start a task under the job owner if the system is configured for that purpose.
There are 2 possible ways to run a task under user account
(in any case, the administrator should have
link:../admin/ProActiveAdminGuide.html#_run_as_me[set computing hosts] to authorize
one of the 2 methods):
The *RunAsMe* mode, also called _user impersonation_, allows a workflow Task to be started either:
* Using your *scheduling login and password*: if computing hosts are configured and user is authorized to run a process under his login and password.
* Using an *SSH key* provided by the administrator: if computing hosts are configured, the administrator should have given user an SSH key.
- under the job owner system account (default _RunAsMe_).
- under a specific system account (_RunAsUser_).
User must first create a credential containing this key:
The *RunAsMe* mode is defined in the XML representation of a Task:
[source, xml]
----
$ PROACTIVE_HOME/tools/proactive-create-cred -F config/authentication/keys/pub.key -l username -p userpwd -k path/to/private/sshkey -o myCredentials.cred
<task name="Linux_Bash_Task"
fork="true"
runAsMe="true" >
<description>
<![CDATA[ A task, ran by a bash engine. ]]>
</description>
<scriptExecutable>
<script>
<code language="bash">
<![CDATA[
whoami
]]>
</code>
</script>
</scriptExecutable>
</task>
----
This command will create a new credentials with *username* as login, *userpwd* as password, using Scheduler public key at
`config/authentication/keys/pub.key`
for credentials encryption and using the private SSH key at
`path/to/private/sshkey`
provided by administrator. The new credential will be stored in *myCredentials.cred*
In the ProActive Studio, the RunAsMe mode is available in the Task Properties *Fork Environment* section.
image::../images/StudioRunAsMe.png["RunAsMe in Studio",width=300]
In order to impersonate a user, RunAsMe can use one of the following *methods*:
- *PWD*: impersonate a user with a login name and password (this is the default, it is also the only mode available on Windows Nodes).
- *KEY*: impersonate a user with a login name and SSH private key.
- *NONE*: impersonate a user with a login name and https://en.wikipedia.org/wiki/Sudo[sudo].
The above modes are defined by the ProActive server administrator when ProActive Nodes are deployed. It is possible to define for each Node a different RunAsMe mode.
The underlying operating system needs to be configured appropriately in order to run a task under impersonation.
See link:../admin/ProActiveAdminGuide.html#_run_as_me_admin[RunAsMe section] in the administration guide for further information.
By default, RunAsMe impersonates the current user with the user's login and password.
Accordingly, the user's login and password inside ProActive must match the login and password of the user in the target machine's operating system.
It is possible, though, to impersonate a different user, to use a different password, or even to use a different impersonation method than the one configured by default.
To do so, add the <<_generic_information>> described in the table below to the workflow task.
Each of these generic information entries can be defined at either task level or job level. When defined at job level, an entry applies to all tasks of the workflow that have RunAsMe enabled.
Tasks without RunAsMe enabled ignore these generic information entries.
[[_run_as_me_generic_info]]
.RunAsMe Generic Information
|===
|Name |Description |Example value
|`RUNAS_METHOD`
|Allows overriding the impersonation method used when executing the task. Can be `pwd`, `key` or `none`.
|`pwd`
|`RUNAS_USER`
|Allows overriding the login name used during the impersonation. This allows a task to run under a different user than the one who submitted the workflow.
|bob
|`RUNAS_DOMAIN`
|Allows defining or overriding a user domain that will be attached to the impersonated user. User domains are only used on Windows operating systems.
|MyOrganisation
|`RUNAS_PWD`
|Allows overriding the password attached to the impersonated user. This can be used only when the impersonation method is set to `pwd`.
|MyPassword
|`RUNAS_PWD_CRED`
|Similar to RUNAS_PWD but the password will be defined inside <<_third_party_credentials>> instead of inlined in the workflow. This method of defining the password should be preferred to RUNAS_PWD for security reasons. The value of RUNAS_PWD_CRED must be the third-party credential name containing the user password.
|MyPasswordCredName
|`RUNAS_SSH_KEY`
|Allows overriding the SSH private key attached to the impersonated user. This can be used only when the impersonation method is set to `key`.
|-----BEGIN RSA PRIVATE KEY----- +
MIIEowIBAAKCAQEAp1fwx6R40kIf (...) +
-----END RSA PRIVATE KEY-----
|`RUNAS_SSH_KEY_CRED`
|Similar to RUNAS_SSH_KEY but the private key will be defined inside <<_third_party_credentials>> instead of inlined in the workflow. This method of defining the SSH private key should be preferred to RUNAS_SSH_KEY for security reasons. The value of RUNAS_SSH_KEY_CRED must be the third-party credential name containing the SSH key.
|MySSHKeyCredName
|===
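As an illustration of the table above, the following fragment is a minimal sketch of a task that overrides the impersonation settings at task level. It assumes the usual `genericInformation` and `info` elements of the workflow XML. The task name, the user `bob` and the credential name `MySSHKeyCredName` are example values only, and the credential is assumed to already exist in the user's third-party credentials.

[source, xml]
----
<task name="Impersonated_Bash_Task"
    fork="true"
    runAsMe="true" >
  <genericInformation>
    <!-- Use SSH key impersonation regardless of the Node default -->
    <info name="RUNAS_METHOD" value="key"/>
    <!-- Run under bob's system account instead of the job owner's -->
    <info name="RUNAS_USER" value="bob"/>
    <!-- Name of the third-party credential holding bob's private key -->
    <info name="RUNAS_SSH_KEY_CRED" value="MySSHKeyCredName"/>
  </genericInformation>
  <scriptExecutable>
    <script>
      <code language="bash">
        <![CDATA[
whoami
]]>
      </code>
    </script>
  </scriptExecutable>
</task>
----

With these settings, `whoami` would be expected to print `bob`, provided the Node's operating system accepts the key for that account.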
Once created, user must connect the Scheduler using this credential. Then, in order to execute your task under your account
set *runAsMe=true* in the task.
TIP: You can now use <<_third_party_credentials,third party credentials>> to store the SSH key with the special entry named *SSH_PRIVATE_KEY*.
[[_multi_node_task]]
=== Reserve more than one node for a task execution
......
......@@ -22,7 +22,7 @@ The Informatica workflow consists of a list of variables described in the follow
| Workflow
| Yes
| String
| e.g. `https://xxx.informaticacloud.com/test/test`
| e.g. `\https://xxx.informaticacloud.com/test/test`
| `USERNAME`
| Username to use for the authentication.
| Workflow
......
......@@ -108,17 +108,47 @@ The following table describes all available generic information:
|task-level
|\http://my-server/my-icon.png
|<<_result_metadata,content.type>>
|Asseign a MIME content type to a byte array task result
|Assign a MIME content type to a byte array task result
|task-level
|image/png
|<<_result_metadata,file.name>>
|Asseign a file name to a byte array task result
|Assign a file name to a byte array task result
|task-level
|image_balloon.png
|<<_result_metadata,file.extension>>
|Asseign a file extension to a byte array task result
|Assign a file extension to a byte array task result
|task-level
|.png
|link:../user/ProActiveUserGuide.html#_run_as_me_generic_info[RUNAS_METHOD]
|Allows overriding the impersonation method used when executing the task. Can be `pwd`, `key` or `none`.
|job-level, task-level
|`pwd`
|link:../user/ProActiveUserGuide.html#_run_as_me_generic_info[RUNAS_USER]
|Allows overriding the login name used during the impersonation. This allows a task to run under a different user than the one who submitted the workflow.
|job-level, task-level
|bob
|link:../user/ProActiveUserGuide.html#_run_as_me_generic_info[RUNAS_DOMAIN]
|Allows defining or overriding a user domain that will be attached to the impersonated user. User domains are only used on Windows operating systems.
|job-level, task-level
|MyOrganisation
|link:../user/ProActiveUserGuide.html#_run_as_me_generic_info[RUNAS_PWD]
|Allows overriding the password attached to the impersonated user. This can be used only when the impersonation method is set to `pwd`.
|job-level, task-level
|MyPassword
|link:../user/ProActiveUserGuide.html#_run_as_me_generic_info[RUNAS_PWD_CRED]
|Similar to RUNAS_PWD but the password will be defined inside link:../user/ProActiveUserGuide.adoc#_third_party_credentials[Third-Party Credential] instead of inlined in the workflow. This method of defining the password should be preferred to RUNAS_PWD for security reasons. The value of RUNAS_PWD_CRED must be the third-party credential name containing the user password.
|job-level, task-level
|MyPasswordCredName
|link:../user/ProActiveUserGuide.html#_run_as_me_generic_info[RUNAS_SSH_KEY]
|Allows overriding the SSH private key attached to the impersonated user. This can be used only when the impersonation method is set to `key`.
|job-level, task-level
|-----BEGIN RSA PRIVATE KEY----- +
MIIEowIBAAKCAQEAp1fwx6R40kIf (...) +
-----END RSA PRIVATE KEY-----
|link:../user/ProActiveUserGuide.html#_run_as_me_generic_info[RUNAS_SSH_KEY_CRED]
|Similar to RUNAS_SSH_KEY but the private key will be defined inside link:../user/ProActiveUserGuide.adoc#_third_party_credentials[Third-Party Credential] instead of inlined in the workflow. This method of defining the SSH private key should be preferred to RUNAS_SSH_KEY for security reasons. The value of RUNAS_SSH_KEY_CRED must be the third-party credential name containing the SSH key.
|job-level, task-level
|MySSHKeyCredName
|<<_python_command,PYTHON_COMMAND>>
|Python command to use in <<../user/ProActiveUserGuide.adoc#_python,CPython script engine>>.
|job-level, task-level
......
......@@ -250,7 +250,7 @@ ProgressFile.setProgress(variables.get("PA_TASK_PROGRESS_FILE"), 50);
| <<_fork_environment, environment>>, <<_pre_post_clean, pre>>, <<_script_tasks, task>>, <<_pre_post_clean, post>>, <<_pre_post_clean, clean>>, <<_control_flow_scripts,flow>>
| -
| *SSH private key*. Private SSH Key used at login. See <<_run_computation_with_your_system_account>>.
| *SSH private key*. Private SSH Key used at login. See <<_run_as_me>>.
| credentials.get( "SSH_PRIVATE_KEY" )
| $credentials_SSH_PRIVATE_KEY
| -
......